Measuring Password Strength: An Empirical Analysis

We present an in-depth analysis on the strength of the almost 10,000 passwords from users of an instant messaging server in Italy. We estimate the strength of those passwords, and compare the effectiveness of state-of-the-art attack methods such as d…

Authors: Matteo Dell'Amico, Pietro Michiardi, Yves Roudier

Matteo Dell'Amico, Pietro Michiardi and Yves Roudier
Institut Eurecom
{matteo.dell-amico, pietro.michiardi, yves.roudier}@eurecom.fr
October 23, 2018

Abstract

We present an in-depth analysis on the strength of the almost 10,000 passwords from users of an instant messaging server in Italy. We estimate the strength of those passwords, and compare the effectiveness of state-of-the-art attack methods such as dictionaries and Markov chain-based techniques. We show that the strength of passwords chosen by users varies enormously, and that the cost of attacks based on password strength grows very quickly when the attacker wants to obtain a higher success percentage. In accordance with existing studies we observe that, in the absence of measures for enforcing password strength, weak passwords are common. On the other hand we discover that there will always be a subset of users with extremely strong passwords that are very unlikely to be broken. The results of our study will help in evaluating the security of password-based authentication means, and they provide important insights for inspiring new and better proactive password checkers and password recovery tools.

1 Introduction

Even though much has been said about their weaknesses, passwords still are, and will be in the foreseeable future, ubiquitous in computer authentication systems. A peculiar characteristic of passwords is that they inherently carry a trade-off between usability and security: while strong passwords are hard for attackers to guess, they are also difficult for the user to remember. As Richard Smith paradoxically notes, password best practices imply that "the password must be impossible to remember and never written down" [17].
In light of this, it is not very surprising that users often knowingly choose to use weak passwords or circumvent security best practices, since they perceive that following them would get in the way of doing their work [1, 15].

To think sensibly about the security of systems that use passwords, it is therefore essential to analyze the characteristics of passwords chosen by users. In this work, we analyze a large dataset containing all user passwords from an instant messaging server located in Italy. Unlike previous empirical studies on passwords [10, 7, 18, 11, 3, 4, 9], this paper evaluates the strength of passwords against a variety of state-of-the-art techniques for candidate generation. The analysis we conducted benefited from having access to the passwords in unencrypted form; this made it possible to measure the strength of all of them, including those that would hardly be cracked even by extremely powerful attackers.

We evaluate the strength of a password in terms of its associated search space size, that is, the number of attempts that an attacker would need to correctly guess it. This measure does not depend on the particular nature of the authentication system nor on the attacker capabilities: it is only related to the attack technique and to the way users choose their passwords. The attack model and the characteristics of the system will instead define the cost that the attacker has to pay for each single guess. By combining this cost with our measures of password strength, it becomes possible to obtain a sound cost-benefit analysis for attacks based on password guessing on an authentication system.

As we will show, different attack techniques are advisable depending on the search space size that the attacker can afford to explore.
This has to be taken into account when proposing and evaluating new techniques for reducing the search space: they may be effective only if the strength of the attack falls within a given interval.

We show that password strength has an extremely wide variance: as a first approximation, the probability of guessing a password at each attempt decreases roughly exponentially as the size of the explored search space grows. These diminishing returns imply that, in most cases, an attacker would eventually find a point where the cost of continuing the attack would not be justified by the probability of success. This study provides figures that can help designers and administrators in assessing the security of their systems by evaluating where that point resides.

2 Related Work

In this section we provide a short review of studies about password security, and make the case for the importance of measuring password strength. Attacks such as phishing or social engineering, where the user is misled into communicating the password to the attacker, are unrelated to password strength and therefore outside the scope of this work.

Pricing Via Processing. To defend against intruders who repeatedly try password after password until they obtain access to the system, it is possible to limit the rate at which the attacker is allowed to try new passwords by requiring the user to perform an action with a moderate cost. While legitimate users would need to perform this action only once every time they try to log on, an attacker would need to repeat this process many times, resulting in a disproportionate cost that renders the attack worthless.
The following measures belong to this category:

• CAPTCHAs [19], which require solving puzzles that are difficult without human intervention;

• key strengthening techniques, which require a few seconds of computation to derive a key from the passwords; this idea first appeared in the design of the UNIX system in the late '70s [10]. A modern key strengthening algorithm, where the computation length is configurable via the choice of a tunable parameter, is PBKDF2 [6].

It is important to note that these techniques impose a trade-off to legitimate users: if an honest user has to pay a cost c, the attacker must pay at most c · s, where s is the strength of the password in terms of the number of attempts needed to guess it. The measures obtained in this paper can be used to estimate costs and benefits of these systems, and thus to properly tune this c parameter.

An alternative approach blocks accounts after a given number of failed attempts. This response, however, opens the door to denial of service attacks on user accounts and is ineffective unless the attack is specifically targeted towards a single user [13].

Offline Attacks. In most cases, the authentication server does not store passwords in plain text. Instead, it keeps an encrypted version of them which is conceptually analogous to a hash: when a user attempts to log on, the password they provide is encrypted and compared to the stored value. In this way, even if an attacker obtains the encrypted passwords, these cannot be used right away to log on to the system. To make it costly for the attacker to guess the password by encrypting lots of password candidates, key strengthening techniques are again applied.
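As an illustration of key strengthening and of the c · s bound discussed above, the following minimal sketch derives a key with PBKDF2 from Python's standard library and measures the per-guess cost c on the local machine. The function names, the iteration count, the 16-byte random salt (PBKDF2 requires a salt as input) and the example strength s are assumptions chosen for illustration, not values taken from this study.

```python
import hashlib
import os
import time

def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def measure_cost(iterations: int) -> float:
    """Seconds spent on one key derivation: the per-guess cost c."""
    salt = os.urandom(16)
    start = time.perf_counter()
    derive_key("example password", salt, iterations)
    return time.perf_counter() - start

def attacker_cost_bound(cost_per_guess: float, strength: int) -> float:
    """Upper bound c * s on the attacker's total cost, as stated in the text."""
    return cost_per_guess * strength

if __name__ == "__main__":
    c = measure_cost(iterations=100_000)   # illustrative iteration count
    s = 10**6                              # hypothetical search space size
    print(f"c = {c:.4f} s per guess, bound c*s = {attacker_cost_bound(c, s):.0f} s")
```

Raising the iteration count increases c linearly for both the honest user and the attacker, which is the tuning knob the text refers to.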
A tta ks based on pre-omputing the enrypted v ersion of the most lik ely passw ords [11 , 12 ℄ are defeated with the simple te hnique of salting, also kno wn sine the early da ys of UNIX: that te hnique w orks b y app ending a random n um- b er to the passw ord b efore enrypting it, and then storing this n um b er along with the enrypted pass- w ord. Sine these te hniques are based on the idea of making guessing atta ks ostly , the passw ord strength that w e are measuring is also a k ey pa- rameter when ev aluating the resiliene of a pass- w ord system to oine atta ks. P assw ord Reo v ery W e measure passw ord strength b y taking in to aoun t attempts to break them with state of the art te hniques. The free passw ord reo v ery soft w are John the R ipp er 1 iden- ties passw ords b y  he king them against a large- sized ditionary , plus a xed set of mangling rules, 1 http://www.openwall.om/j ohn / 2 su h as app ending or prep ending digits to ditio- nary w ords. A ording to Brue S hneier's de- sription [16 ℄, A essData's proprietary P assw ord Reo v ery T o olkit omplemen ts this approa h with a phoneti pattern set generated via a Mark o v  hain routine to generate meaningless but pro- nouneable passw ords. In Setion 5, w e formal- ize a metho d based on the same idea and ev aluate its merits in reduing the sear h spae for ra king passw ords. Proativ e P assw ord Che king A proativ e passw ord  he k er is a system that fores (or ad- vises) the user to  ho ose omplex enough pass- w ords. The impat of these  he k ers on atual passw ord seurit y is debatable: as W u [20 ℄ notes,  [users are℄ v ery go o d at seleting passw ords that are just `go o d enough' to pass whatev er  he king is in plae. 
The MySpae so ial net w ork requires users to ha v e at least a non-alphab eti  harater in their passw ord; in a set of leak ed passw ords, 86% of the users om- plied with this requiremen t b y app ending a n um- b er at the end of their passw ord; for 20% of them that n um b er w as a 1 [14 ℄. F urthermore, a proa- tiv e passw ord  he k er ould enourage users to use non-ditionary passw ords that are related to their p ersonal life su h as dates, telephone n um b ers or li- ense plate n um b ers [1 ℄. F or a motiv ated atta k er, these passw ords are ev en easier to guess than di- tionary w ords. Moreo v er, a strong passw ord in the abstrat ould fore the user to write it do wn and lea v e it in a plae where an atta k er an easily nd it. F or example, man y emplo y ees hide pass- w ords under their mouse pads at their ompanies [17 ℄. In general, it seems that passw ord strength  he k ers atually inrease system seurit y only if they are seen b y users as a to ol that helps them and not just as an additional ho op they ha v e to jump through to get their job done. Existing passw ord  he k ers are based on quite naiv e metris [2, 21 ℄: they  he k on passw ord length, or resiliene to brute fore and ditio- nary based atta ks; still, they do not tak e in to a- oun t adv aned ra king te hniques. Our measure of strength as sear h spae size an b e used as the basis for more eetiv e passw ord  he k ers. Empirial Studies It is a w ell kno wn fat that man y users almost in v ariably  ho ose easy to guess passw ords; urren t empirial studies, ho w ev er, gen- erally fo us on a single kind of atta k and neglet to quan tify ho w strong the remaining share of pass- w ords are with resp et to more general atta ks. T o the b est of our kno wledge, no other w ork ev al- uates the strength of passw ords o v er their whole strength sp etrum and against all state-of-the-art te hniques. 
Analyses on ditionary atta ks rep ort a p eren t- age of brok en passw ords v arying b et w een 17% and 24% [10 , 7, 18 ℄. In Setion 4, b efore in v estigat- ing the remaining stronger passw ords, w e obtain results of similar magnitude, v arying with the t yp e and size of ditionary used. Some studies are based on a dataset of enrypted passw ords, and only rep ort on the ones that ha v e b een atually ra k ed [7 , 11 , 3 , 9℄; in omparison, w e had aess to the plain-text whi h ga v e us in- formation on the passw ords that w ould b e ompu- tationally impratial to break. In a 2007 study [4 ℄, Florenio and Herley ob- tained data ab out the passw ords of ab out 500,000 users. That w ork pro vides in teresting insigh ts ab out user habits, but only quan ties passw ord strength with a simple bit strength measure based on their length and on the use of upp erase, n u- meri, and non-alphan umeri  haraters; resiliene against adv aned passw ord-ra king te hniques is not tak en in to onsideration. 3 Our Dataset Our dataset on tains the unenrypted passw ords for the 9,317 registered users of an Italian instan t messaging serv er. Storing passw ords in plain text on the serv er is required b y authen tiation algo- rithms su h as CRAM-MD5 2 . User registration is free and no p oliy for passw ord strength is enfored: ev en the empt y passw ord is allo w ed. The absene of strength enforemen t allo ws us to in v estigate the b eha vior of users when  ho osing their passw ord in the absene of external requiremen ts. Users are free to  ho ose an y un used username when registering. A total of 269 users (2.89% of the total) use the same string as b oth username and passw ord. The single most eetiv e attempt 2 http://tools.ietf.org/htm l/r f21 95 3 Figure 1: P assw ord length distribution. to guess a giv en user's passw ord w ould therefore b e its o wn username. 
Some users share the same password, and this results in 7,848 unique passwords. While in some cases this may be due to coincidences and use of too frequent passwords, other cases may be the consequence of the same people registering under different usernames at the same server.

The average password length is 7.86. Figure 1 shows the length distribution. Even though the full Unicode character set is usable for the passwords, only 124 different characters had been used. Frequencies of characters have very uneven distributions (see Table 1): while one character out of 11 is an 'a', the most frequent uppercase character ('A') has a frequency of approximately 1 in 500.

In Table 2, we show the matching ratio of various simple regular expressions. More than 50% of the passwords contain only lowercase characters, and less than 7% contain non-alphanumeric characters. Around 15% of them consist of a string of lowercase characters followed by a numeric appendage.

We also analyzed a set of 33,671 leaked MySpace passwords [5, 14]. Since these passwords have been obtained through a phishing attack, they include those of less security-conscious users who fell for the attack. Moreover, MySpace requires users to insert non-alphabetic characters in their passwords, and this imposes an artificial impact on passwords that users, left alone, would choose. For these reasons, we consider this dataset less representative of actual user passwords than our primary one; we however use it in this work to corroborate some of our findings by validating them on another dataset.
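The character frequencies and regular-expression matching ratios discussed above can be computed along the following lines. This is a minimal sketch: the function names and the toy `sample` list are hypothetical, and full-string matching is assumed, since the text speaks of passwords that contain only lowercase characters.

```python
import re
from collections import Counter

def char_frequencies(passwords):
    """Return {character: fraction of all characters} over the whole list."""
    counts = Counter("".join(passwords))
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}

def match_ratio(passwords, pattern):
    """Fraction of passwords matched in full by the regular expression."""
    if not passwords:
        return 0.0
    rx = re.compile(pattern)
    return sum(1 for p in passwords if rx.fullmatch(p)) / len(passwords)

# Toy data for illustration only; not taken from the dataset.
sample = ["password", "abc123", "Secret1", "123456", "p@ss"]

if __name__ == "__main__":
    for pattern in (r"[a-z]+", r"[0-9]+", r"[a-zA-Z0-9]+", r"[a-z]+[0-9]+"):
        print(pattern, match_ratio(sample, pattern))
```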
Charater Coun t P eren tage a 6,681 9.12% e 4,520 6.17% o 4,484 6.12% i 4,388 5.99% r 3,628 4.95% n 3,310 4.52% l 3,095 4.23% s 2,895 3.95% t 2,853 3.90% 1 2,518 3.44%  2,367 3.23% m 2,137 2.92% 0 1,990 2.72% p 1,945 2.66% d 1,813 2.48% 2 1,692 2.31% u 1,640 2.40% b 1,624 2.22% 3 1,487 2.03% g 1,334 1.82% other 16,832 22.98% T able 1: Charater distribution. Expression Example Mat hes [a-z℄+ abdef 51.20% [A-Z℄+ ABCDEF 0.29% [A-Za-z℄+ AbCdEf 53.74% [0-9℄+ 123456 9.10% [a-zA-Z0-9℄+ A1b2C3 93.43% [a-z℄+[0-9℄+ ab123 14.51% [a-zA-Z℄+[0-9℄+ aB123 16.30% [0-9℄+[a-zA-Z℄+ 123aB 1.80% [0-9℄+[a-z℄+ 123ab 1.65% T able 2: P eren tage of passw ords mat hing v arious regular expressions. 4 4 Ditionary A tta k Ditionary atta k is the most eetiv e te hnique to guess the w eak est passw ords. W e ev aluated pass- w ord strength b y using the ditionaries a v ailable in the already men tioned John the R ipp er (JtR) pass- w ord reo v ery to ol. The extended ditionaries that w e used are a v ailable for paid do wnload from the program w ebsite 3 . 4.1 The Ditionaries The JtR ditionaries on tain w ords from 21 dier- en t h uman languages, plus a list of frequen tly used passw ords. F or some languages (lik e English and Italian), v arious ditionaries of dieren t sizes are a v ailable: the smaller ones on tain only the most frequen tly used w ords while the bigger ones also on tain more obsure w ords, the rationale b eing that more ommon w ords are more lik ely to b e  ho- sen as passw ords. T ak en together, all ditionaries aoun t for almost 4 million w ords. A bigger ditio- nary on taining more than 40 millions w ords is ob- tained using mangling rules that attempt to re- ate more omplex passw ords b y altering ditionary w ords, for example b y juxtap osition of ditionary w ords or b y app ending a n um b er at the end of the w ord. 
An often-advised te hnique to reate strong but easy to remem b er passw ords is to turn phrases in to passw ords b y extrating an aron ym, p ossibly also using puntuation. F or example, the phrase Alas, p o or Y ori k! I knew him, Horatio b eomes A,pY!Ikh,H. W e also ev aluated su h aron yms with a ditionary reated b y Kuo et al. [8℄ that w as put together b y sraping w ebsites displa ying mem- orable phrases, su h as itations and m usi lyris. 4.2 Exp erimen tal Results W e sim ulated ditionary atta ks with all the JtR ditionaries. T able 3 sho ws the results for the most represen tativ e instanes. The found olumn lists the p eren tage of pass- w ords that app ear in that ditionary; the guess probabilit y olumn reets the probabilit y that a random w ord from that ditionary mat hes a ran- dom passw ord: a rational atta k er w ould try a w ord from that ditionary only if the b enet of ra king 3 http://www.openwall.om/ word lis ts/ Ditionary Size F ound Guess prob. F requen t passw ords 3,114 7.25% 2 . 33 · 10 − 5 English 1 l 27,424 4.91% 1 . 79 · 10 − 6 English 2 l 296,809 9.42% 3 . 17 · 10 − 7 English 3 l 390,532 11.59% 2 . 97 · 10 − 7 English extra l 444,678 8.03% 1 . 81 · 10 − 7 Italian 1 l 63,041 3.71% 5 . 89 · 10 − 7 Italian 2 l 344,074 14.89% 4 . 33 · 10 − 7 All ab o v e 1,117,767 25.51% 2 . 28 · 10 − 7 All JtR ditionaries 3,917,193 25.94% 6 . 62 · 10 − 8 All JtR + mangling 40,532,676 30.12% 7 . 43 · 10 − 9 Mnemonis [8℄ 406,430 1.27% 3 . 12 · 10 − 8 T able 3: Ditionary atta ks. The l aron ym stands for all-lo w erase ditionaries: those on- taining upp erase letters are mat hed b y v ery few w ords in our dataset. The English 1, English 2 and English 3 ditionaries, lik e Italian 1 and Italian 2, are listed in gro wing size; ea h w ord b elonging to a smaller ditionary is also on tained in the bigger v ersions. 
the passw ord exeeds the in v erse of that probabilit y times the ost of the eort for trying that passw ord. The English extra ditionary has a sligh tly mis- leading name: it on tains w ords that don't ap- p ear in a regular ditionary but that users are lik ely to use, su h as prop er nouns, ommon mis- sp ellings or alterations of w ords. Man y of them are language-agnosti (e.g., Aldebaran) or ome from non-English languages (Mariela). As the serv er is in Italy , most users are Italian. The amoun t of English w ords found in passw ords is not partiularly surprising for those who kno w the tendeny that nativ es ha v e to w ards the hea vy use (and abuse) of English. An in teresting feature is the notieably higher densit y of ommon English w ords (those presen t in the small English 1 ditio- nary); that phenomenon is m u h less relev an t with resp et to Italian. W e think that this is aused b y the fat that most users kno w English as a seond language, and th us are less inlined to use an ob- sure w ord as their passw ord. This suggests that it migh t b e go o d pratie to use one's nativ e language to reate stronger passw ords. The most imp ortan t lesson dra wn from this data is the priniple of diminishing r eturns : the proba- bilit y of guessing a w ord sharply dereases as the ditionary gro ws. The 3,100-w ord ditionary of fre- 5 quen t passw ords ra ks 7% of those in our datasets; b y inreasing roughly 300 times the size of the di- tionary up to more than one million and inlud- ing all Italian and English w ords, the n um b er of ra k ed passw ords rises to 25%. When the n um b er of attempts gro ws b ey ond 40 millions b y inluding other languages and mangling, only 5% more of the passw ords are found. T o put it in another w a y , the probabilit y of guessing a giv en passw ord b y trying an elemen t of the frequen t passw ords ditionary is one in 43,000. 
On the other hand, after having tried all the frequent passwords and the Italian and English dictionaries, the probability of guessing by using another dictionary word is less than one in 500 million! Since the guessing probability decreases so sharply, it is conceivable that in many cases it won't be worth trying a bigger dictionary for the attacker.

We also observe that the mnemonic dictionary is quite ineffective. This may be due to several reasons: first, few users actually use mnemonics for their passwords; second, they are actually much harder to break with dictionary attacks. Moreover, we are not able to ascertain whether the habit of choosing English passwords for Italian users would carry over to the use of mnemonics. Our data is, at the moment, insufficient to point towards one reason or the other.

5 Markov Chain-Based Attack

The fact that dictionaries fall short does not mean that an attacker would need to resort to an exhaustive brute-force attack: some passwords are much more likely to be chosen than others. As seen in Section 3, there is a very uneven distribution of character choice. Moreover, other regularities exist: passwords are usually made of pronounceable sub-strings and/or sequences of keys that are close on the keyboard.

In this section, we describe and validate an attack based on Markov chain-based modeling of the frequencies of sub-strings with parametric length k, or k-graphs. This allows us to label candidate passwords with variable probabilities, where strings that are labeled as more likely are checked first. Some password generating utilities actually use this kind of modeling to obtain meaningless but pronounceable passwords on the grounds that they're easier to remember, thus sacrificing some strength for usability (footnote 4).
5.1 The T e hnique W e base our formalization on the te hniques sho wn in [11℄, extending the mo del so that it applies to sub-strings of length 3 and more. This mo del rep- resen ts a passw ord  hoie as a sequene of random ev en ts: rst, the length of the passw ord is  hosen aording to a giv en probabilit y distribution; then, ea h  harater of the string gets extrated aord- ing to a onditional probabilit y dep ending on the previous k − 1  haraters. W e eno de the  harateristis of passw ords via t w o funtions, λ and ν . λ represen ts the length dis- tribution of passw ords so that, for example, λ (8) is the probabilit y that the passw ord has length 8. ν , instead, represen ts the onditional probabilit y of ea h k -graph with resp et to the orresp onding ( k − 1) -graph: ν ( c 1 . . . c k | c 1 . . . c k − 1 ) is the prob- abilit y that the  harater c k follo ws the sub-string c 1 . . . c k − 1 . F or k = 1 , ν ( c ) expresses the frequeny of c , that is, the probabilit y that a random  hara- ter in a passw ord oinides with c . By  ho osing k = 1 , th us fo using on  harater frequeny , the probabilit y P 1 ( α ) that our mo del will generate a string α (where its length is | α | and its i th  harater is α i ) is P 1 ( α ) = λ ( | α | ) Y 1 ≤ i ≤| α | ν ( α i ) . T o deriv e P k with k ≥ 2 , w e will adopt the on v en tion that α i = ⊥ whenev er i ≤ 0 , where  ⊥  is a sp eial  harater not allo w ed to app ear in passw ords. F or example, w e write the probabil- it y that a passw ord starts with the  a   harater as ν (” ⊥ a ” | ” ⊥ ” ) ; the probabilit y that a  b  follo ws an initial  a  is instead ν (” ⊥ ab ” | ” ⊥ a ” ) . Giv en this, w e an formalize the digraph-based probabilit y P 2 as P 2 ( α ) = λ ( | α | ) Y 1 ≤ i ≤| α | ν ( α i − 1 α i | α i − 1 ) and, in general, 4 See for example gpw ( http://www.multiians.o rg/ thvv/tvvtools.html#gpw ), apg ( http://www.adel.nursat. 
kz/apg/ ), otp ( http://www.fourmilab.h/ one time / ). 6 P k ( α ) = λ ( | α | ) Y 1 ≤ i ≤| α | ν ( α i − k +1 . . . α i | α i − k +1 . . . α i − 1 ) . 5.1.1 Estimating ν and λ It is ob viously imp ortan t that the probabilities en- o ded in the λ and ν funtions are represen tativ e of the real  harateristis of passw ords. W e do this b y adopting a set of strings as a training set and setting λ ( x ) as the fration of strings of length x . Denoting C as the  harater set and σ ( c 1 . . . c k ) as the n um b er of o urrenes of the sub-string c 1 . . . c k in the whole training set, w e set ν ( c 1 . . . c k | c 1 . . . c k − 1 ) = σ ( c 1 . . . c k ) P c ∈ C σ ( c 1 . . . c k − 1 c ) . In the absene of a represen tativ e training set of passw ords, a ditionary an b e used as in [11 ℄. As w e will exp erimen tally sho w in Setion 5.2, using passw ords themselv es as training set nally results in a b etter mo del. In this ase, when omputing P k ( α ) in our exp erimen ts, α itself m ust b e remo v ed from the training set and should not b e tak en in to aoun t when omputing the v alues of λ and σ . As men tioned in Setion 3, some users share the same passw ord. This migh t b e due to  hane and to the fat that those passw ords are quite trivial; another p ossibilit y is that they ome from the same user registering man y aoun ts and using the same passw ords for all of them. In the latter ase, an atta k er w ould not ha v e aess to the passw ord in a represen tativ e training set, and it w ould b e or- ret for our purp oses to remo v e all opies of the passw ord from the training set. Sine w e annot disriminate b et w een the t w o ases, w e will adopt a onserv ativ e approa h that ma y result in o v eres- timating the apabilities of the atta k er, therefore disarding only a single op y of the passw ord from the training set. 
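The estimation of λ and ν from a training set, and the resulting probability P_k, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the names `train`, `nu` and `probability` are hypothetical, k-graphs are counted on strings padded with k − 1 copies of ⊥, and the leave-one-out step described above is left to the caller (train on the list with one copy of the password removed).

```python
from collections import Counter

BOTTOM = "\u22a5"  # the special character ⊥, assumed never to occur in passwords

def train(passwords, k):
    """Estimate lambda (length distribution) and the k-graph counts sigma."""
    lengths = Counter(len(p) for p in passwords)
    lam = {n: count / len(passwords) for n, count in lengths.items()}
    sigma = Counter()    # sigma[c1..ck]: occurrences of each k-graph
    context = Counter()  # context[c1..ck-1]: sum over c of sigma[c1..ck-1 c]
    for p in passwords:
        padded = BOTTOM * (k - 1) + p
        for i in range(len(p)):
            gram = padded[i:i + k]
            sigma[gram] += 1
            context[gram[:-1]] += 1
    return lam, sigma, context

def nu(sigma, context, gram):
    """Conditional probability of the last character given the first k-1."""
    denominator = context[gram[:-1]]
    return sigma[gram] / denominator if denominator else 0.0

def probability(password, k, lam, sigma, context):
    """P_k(password): length probability times the product of nu values."""
    p = lam.get(len(password), 0.0)
    padded = BOTTOM * (k - 1) + password
    for i in range(len(password)):
        p *= nu(sigma, context, padded[i:i + k])
    return p

if __name__ == "__main__":
    training = ["ab", "ab", "ac", "b"]   # toy training set, not real data
    model = train(training, k=2)
    for candidate in ("ab", "ac", "b", "zz"):
        print(candidate, probability(candidate, 2, *model))
```

A password containing a k-graph that never occurs in the training set gets probability 0, which is the under-sampling effect the text discusses next.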
A model with higher values of k should be more accurate, but the process of creating it is more difficult and expensive. In the extreme, a model with k exceeding the maximum password length would explicitly list the probability of occurrence of each possible password: this would require prohibitive training set size and storage capabilities (the required space is of the order of $|C|^k$, where $|C|$ is the size of the character set). With limited resources, when a k-graph does not appear in the training set due to under-sampling, then the probability of a password containing that k-graph is computed as 0. Such a model would therefore never generate the required password.

Algorithm 1: Explicit counting of search space size.

    function size(c_1 … c_{k−1}, l, t)
            ▷ c_1 … c_{k−1}: state, l: string length, t: threshold
        if l = 0 then
            return 1
        s ← 0
        for all c ∈ C do
            p ← ν(c_1 … c_{k−1} c | c_1 … c_{k−1})
            if p ≥ t then
                s ← s + size(c_2 … c_{k−1} c, l − 1, t / p)
        return s

    function total_size(t)
            ▷ t: threshold
        return Σ_i size(⊥ … ⊥, i, t / λ(i))

5.1.2 Computing The Search Space Size

So far, we have described a model that assigns probabilities to passwords, with the aim of measuring how likely it is that a user would actually select a given password. A rational attacker would use this model by enumerating candidate passwords starting with the most likely ones and continuing in decreasing order of probability. In order to measure the search space size that such a strategy would need to explore before finding a given password, we have to find out how many unsuccessful candidates would be generated before the correct one: if the Markovian model labels the probability of a password as p, its associated search space size would therefore be the number of strings with probability of occurrence higher than or equal to p.
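The search space size defined in the paragraph above can be computed by explicit enumeration, in the spirit of Algorithm 1. The sketch below is one possible reading, with assumptions labeled: ν is passed as a nested dictionary `{state: {character: probability}}`, λ as `{length: probability}`, and the remaining threshold is divided by each chosen probability so that the product of all factors stays at or above t. All names and the toy model are hypothetical.

```python
BOTTOM = "\u22a5"  # start-of-password marker, assumed absent from passwords

def size(nu, state, length, t):
    """Count strings of `length` following `state` whose conditional
    probabilities multiply to at least t."""
    if length == 0:
        return 1
    total = 0
    for c, p in nu.get(state, {}).items():
        if p > 0 and p >= t:
            # Slide the (k-1)-character state and relax the threshold by p.
            total += size(nu, (state + c)[1:], length - 1, t / p)
    return total

def total_size(nu, lam, k, t):
    """Number of strings whose model probability is at least t."""
    start = BOTTOM * (k - 1)
    total = 0
    for length, p_len in lam.items():
        if p_len > 0 and p_len >= t:   # otherwise no string of this length qualifies
            total += size(nu, start, length, t / p_len)
    return total

# Toy k = 1 model (character frequencies only); illustrative values.
TOY_NU = {"": {"a": 0.5, "b": 0.3, "c": 0.2}}
TOY_LAM = {1: 0.5, 2: 0.5}

if __name__ == "__main__":
    for threshold in (0.2, 0.07, 0.01):
        print(threshold, total_size(TOY_NU, TOY_LAM, 1, threshold))
```

Because the threshold is a floating point number, strings whose probability equals t almost exactly may fall on either side of the comparison; the running time also grows with the number of strings counted.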
Expliit Coun ting The most ob vious system for omputing the sear h spae size up to a giv en threshold is to plainly en umerate it. In Algorithm 1, w e sho w ho w this an b e implemen ted with a simple reursiv e algorithm. 7 Algorithm 2 Appro ximation of sear h spae size. funtion appr_size ( c 1 . . . c k − 1 , l , t ) ⊲ c 1 . . . c k − 1 : state, l : string length, t : log-threshold if l = 0 then return 1 s ← 0 for all c ∈ C do t ← t − ⌈− log b ν ( c 1 . . . c k − 1 c | c 1 . . . c k − 1 ) ⌉ if t ≥ 0 then s ← s + a he_size  c 2 . . . c k − 1 c, l − 1 , t  return s funtion a he_size ( c 1 . . . c k − 1 , l , t ) ⊲ W e store results from appro x_size in a a he K if ( c 1 . . . c k − 1 , l , t ) / ∈ K then K ( c 1 . . . c k − 1 , l , t ) ← appr_size ( c 1 . . . c k − 1 , l , t ) return K ( c 1 . . . c k − 1 , l , t ) funtion tot al_size ( t ) ⊲ t : threshold return P i a he_size ( ⊥ . . . ⊥ , i, ⌊− log b t · λ ( i ) ⌋ ) Appro ximate Estimation As the sear h spae gro ws, the ab o v e approa h b eomes extremely ex- p ensiv e and should b e replaed with an appro x- imate estimation metho d [11℄. By xing a base b > 1 , an y probabilit y p an b e appro ximate as b − l for an in teger v alue l ≥ 0 . Cho osing l = ⌊− log b p ⌋ appro ximates p b y exess, while l = ⌈− log b p ⌉ ap- pro ximates b y defet. T o help in tuition, l an b e seen as a disrete passw ord strength v alue, whi h an b e omputed as the sum of strengths for ea h k - graph on tained in the passw ord. V alues of b loser to 1 result in a ner gran ularit y for our appro xima- tion, at the ost of an inrease in omputation. By adopting su h an alteration, the omputation gets a big sp eedup b y memoizing the parameters and results of ea h appr o x_size all, and return- ing them when the funtion is alled again with the same parameters. 
This ouldn't b e done with the former v ersion, sine the t threshold parameter of the size funtion is a oating p oin t n um b er whi h is v ery lik ely to b e dieren t at ea h funtion all. Sine w e are aiming for a onserv ativ e estimate for the sear h spae that appro ximates b y exess the apabilities of the atta k er, w e use appro xima- 1 0 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 1 0 7 Search space size 1 0 - 8 1 0 - 7 1 0 - 6 1 0 - 5 1 0 - 4 1 0 - 3 Threshold k = 1 k = 2 k = 3 k = 4 k = 5 Figure 2: Sear h spae size v ersus probabilit y threshold for the k -graph Mark o vian mo del. The plotted urv es sho w the result of the exat om- putation of Algorithm 1, while the p oin ts mark ed b y rosses are the result of the appro ximation of Algorithm 2. tions to obtain a lo w er limit for the sear h spae size. T o do this, w e appro ximate the starting threshold b y defet and all the ν probabilities b y exess. The result of these mo diations is the appro xi- mate funtion dened in Algorithm 2. 5.2 Exp erimen tal results This setion desrib es the results of the exp eri- men ts desrib ed ab o v e when applied to our pass- w ord dataset. Unless otherwise sp eied, w e use the passw ords themselv es as training set. Sear h Spae Size V ersus Probabilit y Threshold In Figure 2 w e sho w the size of sear h spae on taining strings lab eled with a probabilit y greater or equal to a giv en probabilit y threshold. This is omputed for dieren t v alues of k and using b oth the exat oun t and the appro ximate measure from Algorithm 2. W e used a parameter b = 1 . 01 ; with that  hoie, w e obtained a relativ e error of the order of 5% (not notieable in the gure due to the log-log sale). 
By choosing $1 \leq k \leq 3$ (i.e., basing the model on sub-strings of lengths 1 to 3), the probabilities of strings generated by the model roughly follow a power law. It is interesting to note that this mirrors the frequencies of words in human natural languages, which obey a power law as well [22]. For $k \geq 4$, the number of candidate strings grows definitely slower as the probability threshold decreases; this is due to the fact that each k-graph is represented by a low number of strings in the training set, and the number of strings that can be obtained by combining k-graphs that are present in the dataset is limited. We conjecture that, with a bigger training set, we would obtain a power-law distribution also in this case.

[Figure 3: Search space size versus fraction of guessed passwords. X axis: search space size ($10^0$ to $10^{24}$, logarithmic); Y axis: fraction of cracked passwords (0 to 1); one curve for each of k = 1, 2, 3, 4, 5.]

In the following, we use the approximate approach to estimate the search space size where the exact value becomes either impractical or impossible to compute. We compute data points for each threshold $p = 10^{-i}$ ($i$ being an integer) and interpolate with the power law that connects the points (a straight line in the log-log plot).

Password Strength. In Figure 3, we plot the fraction of passwords guessed as a function of the search space size. With higher values of k, we obtain better results for the weaker passwords due to the more precise modeling obtained in this case. However, the passwords that include k-graphs not represented in the training set cannot be guessed. Methods based on smaller k values then become more effective, because they generalize better.
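The trade-off between precise modeling and generalization can be illustrated with a small k-graph model. The sketch below is an assumption-laden simplification rather than the model used in the experiments: the padding symbol `START`, the function names `train` and `probability`, and the three-word training list in the usage note are hypothetical, and the password length distribution is ignored.

```python
from collections import Counter, defaultdict

START = "\x00"  # hypothetical padding symbol standing in for the start marker


def train(passwords, k):
    """Count, for every (k-1)-character state, which character follows it."""
    counts = defaultdict(Counter)
    for pw in passwords:
        padded = START * (k - 1) + pw
        for i in range(len(pw)):
            state = padded[i:i + k - 1]
            counts[state][padded[i + k - 1]] += 1
    return counts


def probability(password, counts, k):
    """Product of the conditional frequencies of each k-graph in the password."""
    p = 1.0
    padded = START * (k - 1) + password
    for i in range(len(password)):
        state = padded[i:i + k - 1]
        nxt = padded[i + k - 1]
        seen = counts.get(state)
        if not seen or seen[nxt] == 0:
            # A k-graph never seen in training cannot be generated at all.
            return 0.0
        p *= seen[nxt] / sum(seen.values())
    return p
```

Trained on `["abc", "abd", "xyz"]`, the model with k = 3 assigns probability 1/3 to "abc" but 0 to "abz", whose trigraph "abz" never occurs in training. The model with k = 1 still gives "abz" a small non-zero probability, which is the sense in which lower k values generalize better.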
In practice, the optimal strategy depends on the resources of the attacker, measured by the number of attempts that can be tried.

[Figure 4: Search space size versus fraction of guessed passwords on the MySpace dataset. X axis: search space size ($10^0$ to $10^{24}$, logarithmic); Y axis: fraction of cracked passwords (0 to 1); one curve for each of k = 1, 2, 3, 4, 5.]

The diminishing returns effect that we discovered for dictionary attacks also applies to this technique: even when choosing the best value of k for each case, around 100,000 candidates need to be tried in order to guess 20% of the passwords (k = 5); this number rises to roughly 1.1 billion candidates for a success rate of 40% (k = 3); the search space needed to break 90% of the passwords grows to approximately $3 \cdot 10^{17}$ (k = 2). With such a huge variance in the size of the search space, it seems that no reasonable attack based on password guessing would succeed in guessing all passwords, except in those cases where users are artificially forced to limit password strength, for example by imposing a maximum length.

MySpace Passwords. In Figure 4, we repeat our measurements using MySpace passwords in place of our main dataset, both as training set and as guessed passwords. We obtain qualitatively similar results: in particular, higher values of k are more appropriate for weaker passwords, and the diminishing returns principle holds. From a quantitative point of view, the search space for weak passwords is bigger, while it is smaller for stronger passwords.
We think that this is mainly due to the particularities of the dataset: weak passwords are made stronger by the requirement of non-alphabetic characters; strong passwords created by security-conscious users, on the other hand, are under-represented, since such users are not likely to fall victim to a phishing attack.

[Figure 5: Comparison of brute force and Markov-model based attacks. X axis: search space size ($10^4$ to $10^{48}$, logarithmic); Y axis: fraction of missed passwords ($10^{-3}$ to $10^0$, logarithmic); curves: brute force, k = 1, k = 2.]

Brute Force. In Figure 5, we compare the brute force approach with our Markovian modeling. The brute force approach starts by trying the empty password, then proceeds by enumerating all possible passwords with increasing length. The full Unicode character set currently has more than 99,000 characters (see footnote 5), but many of them are very rare and definitely unlikely in a password; to account for this, we again took a conservative approach overestimating the attacker capabilities, and took into account only the 124 characters that we have found in our dataset. In all but the most extreme cases, the Markovian model proves more efficient by orders of magnitude. It is not before $10^{40}$ candidates (and having found 99.7% of the passwords) that a brute force approach becomes more effective than the Markovian model with k = 1 (character frequencies).
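To give a feeling for the scale of these numbers, the arithmetic behind the brute force enumeration can be sketched as follows. The 124-character alphabet comes from the text; the function names, the Julian-year constant and the default cluster parameters (a thousand machines, each trying one password per clock cycle at 10 GHz, the deliberately generous setting used in this section) are assumptions of the example.

```python
ALPHABET_SIZE = 124  # distinct characters found in the dataset, per the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def brute_force_size(max_length, alphabet_size=ALPHABET_SIZE):
    """Candidates tried once every string up to max_length has been enumerated.

    The empty password (length 0) is included, matching the enumeration order
    described in the text.
    """
    return sum(alphabet_size ** i for i in range(max_length + 1))


def length_reached(candidates, alphabet_size=ALPHABET_SIZE):
    """Longest length fully enumerated within the given number of candidates."""
    length = 0
    while brute_force_size(length + 1, alphabet_size) <= candidates:
        length += 1
    return length


def years_needed(candidates, machines=1000, guesses_per_second=10 ** 10):
    """Wall-clock years for a cluster trying one password per clock cycle."""
    return candidates / (machines * guesses_per_second) / SECONDS_PER_YEAR
```

Under these assumptions, $10^{40}$ candidates are enough to exhaust every string of up to 19 characters, and `years_needed(10 ** 40)` evaluates to roughly $3.2 \cdot 10^{19}$ years.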
A search space of $10^{40}$ candidates is well beyond the capabilities of any realistic attacker: to put this in context, a cluster of a thousand 10 GHz machines would need more than $3 \cdot 10^{19}$ years to reach that number of iterations, even assuming that they are able to try a password for each clock cycle.

Footnote 5: http://www.unicode.org/press/pr-ucd5.0.html

[Figure 6: Scatter plots of password length (Y axis, 0 to 20) versus strength (associated search space size on X axis, $10^0$ to $10^{39}$, logarithmic). One panel for each of k = 1, 2, 3, 4.]

Strength and Password Length. In Figure 6, we highlight the relationship between the length of a password and its strength. As the graphs show, the assumption that longer passwords are stronger can only be regarded as a rule of thumb: a short password containing infrequent characters and/or sequences thereof can actually be stronger than a noticeably longer one. The correlation between length and strength becomes weaker as the k parameter grows: long but weak passwords may be based on predictable long patterns that are less efficiently predicted by models based on lower k values. For example, it is quite likely that the "abcd" sequence is followed by an "e"; a model based on digraphs, though, cannot capture this and can only model which character is more likely to follow a "d".

Training Sets. Figure 7 illustrates how the choice of training sets affects the attack performance.
The training sets used are our sets of usernames and passwords, the MySpace leaked passwords, the JtR common password dictionary, and Italian and English dictionaries.

The most effective training set is the real password set. The common passwords dictionary from JtR is more representative of real passwords than standard dictionaries, since it contains combinations of characters, such as punctuation and digits, that don't appear in standard dictionaries. Still, it appears that average passwords do not closely resemble the most common ones.

[Figure 7: Comparison of various training sets for guessing passwords in our dataset (k = 2). X axis: search space size ($10^0$ to $10^{24}$, logarithmic); Y axis: fraction of cracked passwords (0 to 1); curves: Passwords, MySpace, Usernames, JtR common, English, Italian.]

The case of MySpace passwords as training set is interesting: they are close to the performance of our password dataset for strong passwords, but they do not represent weak ones well. We believe this is due to the over-representation of non-alphabetic characters, which are required to be present in MySpace passwords. The difference in coverage on strong passwords (around 5% with equivalent search space size) can also be attributed to this feature, as well as to the following factors:

• Difference in computer literacy: the MySpace sample contains only the victims of a phishing attack;
• Difference in language: MySpace users are distributed worldwide.

If a representative training set of real passwords is not available to the attacker, usernames are by far the most effective training set. It appears that, when users are asked to provide a username and a password, they employ similar criteria.
This is quite surprising, since the two strings need to satisfy very different, and arguably conflicting, criteria: good usernames are easily memorable, while a strong password has to be as difficult to guess as possible.

[Figure 8: Comparison of complexity between passwords and usernames. X axis: search space size ($10^0$ to $10^{24}$, logarithmic); Y axis: fraction of cracked passwords (0 to 1); curves: Usernames k = 2, Passwords k = 2, Usernames k = 3, Passwords k = 3.]

Usernames. The former result suggests a consideration: usernames and passwords are chosen simultaneously, when registering a new account. A user wants both strings to be memorable, since the two are needed in order to log on successfully. However, while there is no incentive in choosing complex usernames, a security-conscious user will commit some effort to make their password more complex. The difference in complexity between usernames and passwords is therefore a way to measure the effort that users willingly put in making their passwords more complex: while usernames can be very long or difficult to guess, this is not likely to happen as the result of a conscious attempt to do so.

In Figure 8, we compare the search space size associated to usernames and passwords. Matching what we have done with passwords, the training set used to guess a given username consists of all the usernames except the one under scrutiny. It turns out that the effort that users put in creating complex passwords is measurable, but it is overall quite weak: given a choice for k and a search space size, the percentage of cracked usernames never exceeds the cracked passwords by more than 15%.
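Training on every string except the one under scrutiny does not require rebuilding the model from scratch for each target. The text does not say how this was implemented; one possible approach, sketched here with hypothetical function names and made-up usernames, is to count all k-graphs once and subtract the contribution of the held-out string before scoring it.

```python
from collections import Counter


def kgraphs(s, k):
    """All substrings of length k contained in s, in order."""
    return [s[i:i + k] for i in range(len(s) - k + 1)]


def kgraph_counts(strings, k):
    """Count every k-graph over the whole collection of strings."""
    counts = Counter()
    for s in strings:
        counts.update(kgraphs(s, k))
    return counts


def leave_one_out(counts, target, k):
    """Counts as if `target` had been removed once from the training set."""
    held_out = counts.copy()
    held_out.subtract(kgraphs(target, k))
    # Unary plus drops entries whose count fell to zero (or below), so that
    # k-graphs seen only in the target look unseen, as they should.
    return +held_out
```

For example, with the usernames `["mario", "maria", "luigi"]` and k = 2, holding out "luigi" removes the digraph "lu" from the model entirely, while "ma" keeps its count of 2. The original `counts` object is left untouched and can be reused for the next target.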
6 Combined Strategy

Our results confirm that no single strategy or technique is the most effective in reducing the search space: dictionaries are most effective in discovering the weakest passwords; the coverage (fraction of passwords that are in the dictionary) grows as the dictionary size grows, but this entails a loss in precision (fraction of dictionary items that are actual passwords). The Markov-chain based technique should be used when dictionaries are exhausted. Higher values of k obtain better results at first, but after a number of attempts they become quite ineffective.

No single strategy is the best one for all cases; this, in fact, validates the approach taken by password recovery systems that adopt bigger and bigger dictionaries in cascade, and resort afterwards to Markov-based techniques. In this section, we summarize our findings by presenting the results that an attacker would be able to obtain by using such a technique.

Consistently with our approach of estimating the capabilities of the attacker by excess in the face of uncertainty, we assume that the attacker has access to a password training set which is as effective as the one we obtain from the clear text. Furthermore, we also assume that the attacker is able to predict the effectiveness of the techniques that we measured in Sections 4 and 5. Based on this knowledge, using the training set as a dictionary, the strategy for the dictionary-based first part of the attack is as follows:

1. Try the username;
2. Try the common passwords dictionary;
3. Try all passwords in the training set;
4. Try the English 1 dictionary;
5. Try the Italian 1 and 2 dictionaries;
6. Try the English 2, 3 and extra dictionaries;
7. Try all remaining JtR dictionaries;
8. Try mangling rules.

If this approach is not sufficient, one should resort to the Markovian model.
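The bookkeeping behind such a cascade can be sketched as follows. This is only an illustration: the function name `cascade`, the stage names and the tiny word lists in the usage note are invented, and the per-account first step (trying each user's own username) is not modeled, since it is not a shared dictionary.

```python
def cascade(stages, targets):
    """Run dictionaries in order; report cumulative attempts and coverage.

    stages: list of (name, iterable of candidate strings), tried in order.
    targets: list of passwords the attacker is trying to guess.
    Returns a list of (name, cumulative_attempts, fraction_cracked).
    """
    tried = set()
    report = []
    for name, candidates in stages:
        # A candidate already tried in an earlier stage costs nothing extra.
        tried.update(candidates)
        cracked = sum(1 for pw in targets if pw in tried)
        report.append((name, len(tried), cracked / len(targets)))
    return report
```

Running it on `[("common", ["123456", "password"]), ("english", ["password", "monkey", "dragon"])]` against the targets `["password", "monkey", "x7!qP", "123456"]` reports 2 attempts and 50% coverage after the first stage, then 4 attempts and 75% after the second: "password" appears in both lists but is counted once.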
[Figure 9: Search space size for passwords that are not found in any dictionary. In the inner frame, detail on the first iterations.]

In Figure 9, we show the search space for the passwords that have not been discovered within any dictionary. With respect to Figure 3, there is a sharp decrease in the success rate until the search space size reaches approximately $10^8$. In particular, techniques with k = 5 and k = 4 are unable to break more than, respectively, roughly 1% and 4% of the passwords. This matches the intuition that dictionary-based attacks are more useful against the less complex passwords.

Based on the data represented in Figure 9, an efficient strategy for the attack would be as follows:

1. Try 500,000 candidates with the model based on k = 5;
2. Try 7,000,000 candidates with k = 4;
3. Try 700,000,000 candidates with k = 3;
4. Try $7 \cdot 10^{16}$ candidates with k = 2;
5. Continue with k = 1.

In Table 4, we summarize the search space size and percentage of cracked passwords for each of these steps. This is the answer to our original question: how many attempts an attacker would need in order to guess a given percentage of the passwords.
By integrating this with system-specific knowledge such as the computational cost needed to perform a single guess and the amount of resources that the attacker has access to, it is possible to estimate the percentage of passwords that are vulnerable to a given attack.

    Step                      #attempts        Cracked
    Username                  1                2.88%
    Common passwords          3,115            9.95%
    Training set              10,431           28.83%
    English 1                 36,574           30.51%
    Italian 1                 98,511           32.25%
    Italian 2                 373,834          36.31%
    English 2                 632,613          37.18%
    English 3                 722,215          37.69%
    English extra             1,123,841        40.07%
    JtR - all dictionaries    3,923,660        41.14%
    Mangling                  40,538,747       44.26%
    Markov chain - k = 5      41,070,093       45.05%
    Markov chain - k = 4      48,051,199       46.76%
    Markov chain - k = 3      ~750,000,000     58.10%
    Markov chain - k = 2      ~7 · 10^16       91.06%
    Markov chain - k = 1      ~10^40           99.71%

Table 4: Cumulative number of attempts and of guessed passwords for the multi-step approach. Candidates that would be checked in more than one dictionary are counted only once. For the Markov chain technique with $k \leq 3$, the search space has not been generated explicitly and its size has been approximated with Algorithm 2.

7 Conclusion

As the bibliography of this work witnesses, the first studies on password cracking date back to almost 30 years ago. Still, the techniques that are used in state-of-the-art password-cracking applications are quite simple: decades of research suggest that it is possible to do better than applying simple Markov chain-based modeling techniques.

The results of our measurement study may provide an explanation as to why not much has been done in this direction: the diminishing returns effect implies that, even if the size of the search space decreases by orders of magnitude, the percentage of passwords that an attacker would be able to crack in a given number of attempts would increase only by an unimpressive percentage.
In addition, it is likely that an innovative strategy for exploring the search space would improve over the state of the art only for a given interval of search space sizes; the low-cost/high-reward part of the search space is already easily covered by dictionaries of frequent passwords. When such an attack proves ineffective, an attacker could change target to find an easier prey, or use other means of attack which are not based on the password strength, such as social engineering, phishing, or exploitation of vulnerabilities in software or in the protocol: as the effort invested in an unsuccessful attack grows, the attack is more and more likely to be unsuccessful in the future as well.

We focused on the strength of passwords chosen by users in the absence of password strength enforcement. As pointed out in Section 2, it is debatable whether systems enforcing password complexity actually increase security: they may instead lead users to circumvent the enforcement techniques by adopting insecure behavior. To assess this, measuring password complexity with and without enforcement should be coupled with an analysis of user behavior.

Another interesting question yet to be addressed regards the correlation between the strength of passwords and the domain they are related to. In particular, how will the password strength of a user vary if getting an account compromised would result in a noticeable loss? In [4], some evidence that users actually choose better passwords for accounts related to valuable assets (e.g., PayPal) is reported. Unfortunately, the bit-strength measure adopted is quite simple. Further investigations would be required to obtain actual figures in terms of attacker costs in order to break an account.
8 Acknowledgements

The authors would like to thank Sebastian Porst and Roger Grimes for having shared the set of MySpace passwords, and Cynthia Kuo, Sasha Romanosky, and Lorrie F. Cranor for having shared their mnemonics dictionary.

References

[1] A. Adams and M. A. Sasse. Users are not the enemy. Commun. ACM, 42(12):40-46, December 1999.
[2] M. Bishop. Improving system security via proactive password checking. Computers & Security, 14(3):233-249, 1995.
[3] J. A. Cazier and D. B. Medlin. Password security: An empirical investigation into e-commerce passwords and their crack times. Information Security Journal: A Global Perspective, 15(6):45-55, 2006.
[4] D. Florencio and C. Herley. A large-scale study of web password habits. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 657-666, New York, NY, USA, 2007. ACM.
[5] R. A. Grimes. MySpace password exploit: Crunching the numbers (and letters). InfoWorld online article, November 2006.
[6] B. Kaliski. RFC 2898 - PKCS #5: Password-based cryptography specification version 2.0. Technical report, IETF, September 2000.
[7] D. V. Klein. Foiling the cracker: A survey of, and improvements to, password security. In Proceedings of the 2nd USENIX UNIX Security Workshop, 1990.
[8] C. Kuo, S. Romanosky, and L. F. Cranor. Human selection of mnemonic phrase-based passwords. In SOUPS '06: Proceedings of the second symposium on Usable privacy and security, pages 67-78, New York, NY, USA, 2006. ACM Press.
[9] S. Marechal. Advances in password cracking. Journal in Computer Virology, 4(1):73-81, February 2008.
[10] R. Morris and K. Thompson. Password security: a case history. Commun. ACM, 22(11):594-597, November 1979.
[11] A. Narayanan and V. Shmatikov. Fast dictionary attacks on passwords using time-space tradeoff. In CCS '05: Proceedings of the 12th ACM conference on Computer and communications security, pages 364-372, New York, NY, USA, 2005. ACM Press.
[12] P. Oechslin. Making a faster cryptanalytic time-memory trade-off. In Advances in Cryptology - CRYPTO 2003, pages 617-630, 2003.
[13] B. Pinkas and T. Sander. Securing passwords against dictionary attacks. In CCS '02: Proceedings of the 9th ACM conference on Computer and communications security, pages 161-170, New York, NY, USA, 2002. ACM Press.
[14] S. Porst. A brief analysis of 40,000 leaked MySpace passwords. Blog post at http://www.the-interweb.com/serendipity/index.php?/archives/94-A-brief-analysis-of-40,000-leaked-MySpace-passwords.html, November 2007.
[15] S. Riley. Password security: What users know and what they actually do. Usability News, 8(1), February 2006.
[16] B. Schneier. Schneier on security: Choosing secure passwords. Blog post at http://www.schneier.com/blog/archives/2007/01/choosing_secure.html, January 2007.
[17] R. E. Smith. The Strong Password Dilemma, chapter 6. Addison-Wesley, 2002.
[18] E. H. Spafford. Observing reusable password choices. In Proceedings of the 3rd Security Symposium. Usenix, pages 299-312, 1992.
[19] L. von Ahn, M. Blum, and J. Langford. Telling humans and computers apart automatically. Commun. ACM, 47(2):56-60, February 2004.
[20] T. Wu. A real-world analysis of Kerberos password security. In Proceedings of 1999 Network and Distributed System Security Symposium, February 1999.
[21] J. J. Yan. A note on proactive password checking. In NSPW '01: Proceedings of the 2001 workshop on New security paradigms, pages 127-135, New York, NY, USA, 2001. ACM.
[22] G. K. Zipf. Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley, 1949.
