A bounded confidence approach to understanding user participation in peer production systems

Commons-based peer production does seem to rest upon a paradox. Although users produce all contents, at the same time participation is commonly on a voluntary basis, and largely incentivized by achievement of project's goals. This means that users ha…

Authors: Giovanni Luca Ciampaglia

A Bounded Confidence Approach to Understanding User Participation in Peer Production Systems

Giovanni Luca Ciampaglia∗
ciampagg@usi.ch

September 29, 2018

Abstract

Commons-based peer production seems to rest upon a paradox. Although users produce all contents, participation is commonly voluntary and largely incentivized by the achievement of the project's goals. This means that users have to coordinate their actions and goals in order to keep themselves from leaving. While this situation is easily explainable for small groups of highly committed, like-minded individuals, little is known about large-scale, heterogeneous projects, such as Wikipedia.

In this contribution we present a model of peer production in a large online community. The model features a dynamic population of bounded confidence users and an endogenous process of user departure. Using global sensitivity analysis, we identify the most important parameters affecting the lifespan of user participation. We find that the model presents two distinct regimes, and that the shift between them is governed by the bounded confidence parameter. For low values of this parameter, users depart almost immediately. For high values, however, the model produces a bimodal distribution of user lifespan. These results suggest that user participation in online communities could be explained in terms of group consensus, and provide a novel connection between models of opinion dynamics and commons-based peer production.

1 Introduction

In the past decade mass collaboration platforms have become common in several production contexts. The term commons-based peer production has been coined to refer to a broad range of collaborative systems, such as those used for producing software, sharing digital content, and organizing large knowledge repositories. These systems, however, seem to be based upon a paradox.
In wikis, there is a link between quality and cooperation [30], but, at the same time, contribution is voluntary and based on non-monetary incentives [23, 26]. For small teams, this might not be a problem. In large-scale wikis, where low access barriers are necessary to attract vast masses of contributors [8], and where expert users play a crucial role in maintenance and governance [2], user retention becomes instead crucial [10].

∗ Accepted to Socinfo 2011. The final publication is available at www.springerlink.com

An established fact about participation in online groups is the preferential behavior of users, that is, a newcomer's long-term participation can be predicted by the outcome of his or her early interactions [1, 21]. This could be explained in terms of socialization theory [6], as users assess the willingness of the community to accept them and vice versa. It is also true, however, that quality assessment of the produced contents, and in particular the comparison of the objectives of an individual with those of the community, is important in determining user participation [17]. This could be explained as a form of day-to-day coordination or group consensus taking place among editors [15].

In this paper we study user participation as a collective social phenomenon [4]. Other models of peer production have been proposed already, for example for social information filtering platforms [13]. Here, we draw specifically from the modeling work on social influence under bounded confidence [9, 12].

Let us consider a community of users engaged in editing a collection of pages, e.g. Wikipedia. Pages are characterized by a certain number of features upon which users can find themselves in agreement or not. For example, let us consider the writing style of pages. Users try to modify pages according to their objectives, i.e. using their own style. At the same time, by interacting with contents, users can also be influenced by the style of other users.
This reciprocal influence, however, happens only to a certain extent, that is, only when user and page (that is, their styles) are similar enough. Vandals, to illustrate with the same example, might not be interested in learning the encyclopedic writing style. In the context of social psychology this phenomenon is known as bounded confidence, and is regarded as a general feature of human communication within groups that try to reach consensus [12]. It can also be seen as a form of herding, in that people are influenced by the social context they are in [22].

The population of users in our model is dynamic, with user departure determined endogenously by the social influence process. Although others have already applied Deffuant's model to a dynamic population [3], here we explicitly link the process of social influence to user participation.

We implemented these ideas in an agent-based model of a peer production system. In this model, several factors affect the behavior of agents, such as user activity, content popularity, and community growth. To understand which factors are truly important for the resulting dynamics of user participation, we performed a factor screening using global sensitivity analysis.

1.1 Related work

The subject of user participation in mass collaboration systems has already been touched upon by several authors, for example on social networking sites [16] and knowledge sharing platforms [31]. A "momentum" law has been proposed for the distribution of the lifetime edits of inactive users [29]. The distribution of user account lifespans has been shown to decay with a heavy tail, and a power-law model has been proposed after this observation [11]. Empirical data from Wikipedia, however, seem to support a superposition of different regimes [7]; a feature of the model we present here is indeed a bimodal distribution of user lifespans.
In the context of wikis and other free open source initiatives, some authors have used survival analysis to outline the differences between different communities [20], but this modeling technique is not suited to understanding the connection between social influence, group coordination, and user retention. We advocate the need to model such processes explicitly.

The paper is organized as follows: in Sec. 2 we introduce our model of peer production; in Sec. 3 we briefly describe global sensitivity analysis and Gaussian Processes, the two statistical techniques we used for the factor screening study; in Sec. 4 we present our main results, and we discuss them in Sec. 5.

2 An agent-based model of commons-based peer production

In this section we introduce our model of peer production. While we make explicit use of the terminology of wiki platforms (e.g. "users" who "edit pages"), we stress that ours is a general model of consensus building in a dynamic bipartite population, and not merely a description of a wiki platform. We also stress that in our model the state of agents does not necessarily represent an opinion in the classic sense of other studies of opinion dynamics, i.e. the extremes of the spectrum do not necessarily denote, say, political extremism, nor do we speak of "moderates" to identify the center of the opinion space.

To keep things simple, we consider only the unidimensional case, i.e. the state of an agent is a scalar number in the interval $[0, 1]$. We denote with $x(t)$ the state of a generic user at time $t$ and with $y(t)$ the state of a generic page. The interaction rule between a user and a page captures the dynamics of social influence. Let us imagine that at time $t$ a user edits a page. Let $\mu \in [0, 1/2]$ be the speed (or uncertainty) parameter and $\varepsilon \in [0, 1]$ the confidence [18].
If $|x(t) - y(t)| < \varepsilon$ then:

$$x(t) \leftarrow x(t) + \mu\,(y(t) - x(t)) \tag{1}$$
$$y(t) \leftarrow y(t) + \mu\,(x(t) - y(t)) \tag{2}$$

else, if $|x(t) - y(t)| \ge \varepsilon$, we allow only Eq. (2) to take place, with probability $r$. This addition to the bounded confidence averaging rule reflects the fact that, in peer production systems, users often deal with content they do not agree with without being influenced by it, as when a vandalized page is reverted to a previous, non-vandalized revision (also known as rollback).

Different pages can reflect different topics and hence receive attention from users based on their popularity. We employ a simple reinforcement mechanism to model this. Let $c_p \ge 0$ be a constant. If $m_t$ is the number of edits a page has received up to time $t$, then the probability of it being selected at that time will be proportional to $m_t + c_p$. When $c_p \to \infty$, pages will be chosen for editing with uniform distribution, regardless of the number of edits they have received. Hence, we can study the impact of content popularity on user participation by setting $c_p$ to a small or large value.

Of course, users do not always choose to edit an existing page. Sometimes, a user can decide to create a new page. We model this by considering a rate of new page creations $\rho_p$. Whenever a new page is created, its state $y$ is equal to the state $x$ of its creator. Creators are chosen at random among existing users.

In order to model user participation, the population of users is dynamic. First, we consider an input rate of new users $\rho_u$, whose state is chosen at random within the interval $[0, 1]$. Second, we consider an inhomogeneous departure rate that depends on the experience of users. Let us consider a generic user at time $t$ and let us denote with $n_t$ the number of edits he (or she) did up to $t$, and with $s_t$ the number of these edits that resulted in the application of Eq. (1).
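As an illustration, the interaction rule of Eqs. (1)–(2), the rollback case, and the edit counters $n_t$ and $s_t$ can be sketched in Python as follows. This is a minimal sketch, not the simulator used in the study: the names `User`, `Page` and `interact` are hypothetical, and two choices are assumptions of ours, namely that both updates use the pre-interaction states and that every attempted edit increments the page counter $m_t$.

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    x: float    # user state in [0, 1]
    n: int = 0  # n_t: edits made so far
    s: int = 0  # s_t: edits that resulted in Eq. (1)


@dataclass
class Page:
    y: float    # page state in [0, 1]
    m: int = 0  # m_t: edits received so far (assumed to count every edit)


def interact(user, page, eps, mu, r, rng=random):
    """Apply one user-page edit; return True if Eq. (1) was applied."""
    x, y = user.x, page.y              # assumption: updates use the old states
    user.n += 1
    page.m += 1
    if abs(x - y) < eps:
        user.x = x + mu * (y - x)      # Eq. (1): the user moves toward the page
        page.y = y + mu * (x - y)      # Eq. (2): the page moves toward the user
        user.s += 1
        return True
    if rng.random() < r:
        page.y = y + mu * (x - y)      # rollback-like edit: only Eq. (2) applies
    return False
```

For example, `interact(User(x=0.2), Page(y=0.4), eps=0.3, mu=0.5, r=0.0)` moves both states to approximately 0.3, whereas a user at 0.0 editing a page at 1.0 with the same `eps` leaves the user unchanged and modifies the page only with probability `r`.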
Let $c_s \ge 0$ be a constant and $r(t)$ be the ratio

$$r(t) = \frac{s_t + c_s}{n_t + c_s} \tag{3}$$

The rate of departure $\lambda_d(t)$ is then defined as:

$$\lambda_d(t) = \frac{r(t)}{\tau_0} + \frac{1 - r(t)}{\tau_1} \tag{4}$$

with $\tau_0 \gg \tau_1$ time scale parameters. Depending on the value of $r(t)$, the expected lifetime $\langle\tau\rangle$ will interpolate between two values: $\langle\tau\rangle = \tau_0$ (long lifetime) for $r(t) = 1$, and $\langle\tau\rangle = \tau_1$ (short lifetime) for $r(t) = 0$. If $c_s \to \infty$, we recover a homogeneous process with rate $\tau_0^{-1}$, so we can set $c_s$ to control how sensitive the departure rate is to unsuccessful interactions.

3 Evaluation Methods

3.1 Computer Code Emulation via Gaussian Processes

Although we can perform the statistical evaluation of our peer production model using the computer simulator directly, this approach is not desirable, as evaluation of the computer code can be quite time consuming. We rely instead on emulation of the computer code output. We use a Gaussian Process (GP) as a surrogate model of the average lifetime $\langle\tau\rangle$ of users in our peer production system. Gaussian processes (or Gaussian Random Functions, GRF) are a supervised learning technique used for functional approximation of smooth surfaces and for prediction purposes: see [25] for the application of GP to computer code evaluation.

Given input sites $\Theta_{obs} = (\theta_1, \theta_2, \ldots, \theta_N)$ we can evaluate our model as specified above, and obtain observations of the average user lifetime $T_{obs} = (\tau_1, \tau_2, \ldots, \tau_N)$. Based on these observations, we wish to predict the value of $\tau$ at an untested input site $\theta$, i.e. $\tau(\theta)$. With a GP, this value is $\hat{\tau}(\theta) = E[\tau(\theta) \mid \Theta_{obs}]$; the uncertainty in the prediction, that is, $\mathrm{Var}[\hat{\tau}(\theta)]$, is equal to $\mathrm{Var}[\tau(\theta) \mid \Theta_{obs}]$. With it we can compute a confidence interval that characterizes the uncertainty of the prediction of $\tau$ based on the training data $(\Theta_{obs}, T_{obs})$.
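To make the prediction step concrete, the following sketch computes the conditional mean and variance of a GP in plain Python. It is an illustration only: the zero prior mean (so observations should be centred beforehand), the squared-exponential covariance, its length scale, the jitter term and all function names are assumptions of ours, not the settings of the emulator used in the study.

```python
import math


def sq_exp_kernel(a, b, length=0.3):
    """Squared-exponential covariance between two input sites (assumed kernel)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def gp_predict(theta_obs, tau_obs, theta, jitter=1e-8):
    """Return (mean, variance) of tau(theta) conditional on the observations."""
    n = len(theta_obs)
    # Covariance matrix of the observed sites, with jitter on the diagonal
    K = [[sq_exp_kernel(theta_obs[i], theta_obs[j]) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Covariances between the observed sites and the new site
    k_star = [sq_exp_kernel(t, theta) for t in theta_obs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, tau_obs)))
    var = sq_exp_kernel(theta, theta) - sum(
        k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)
```

At an observed site the predicted mean reproduces the observation and the variance is close to zero; far from all observed sites the prediction falls back to the prior (mean 0, variance 1 under the assumed kernel). An approximate confidence interval follows as the mean plus or minus a multiple of the square root of the variance.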
There are several strategies for selecting the input sites $\Theta_{obs}$ at which we will run our computer simulator. Here we choose to employ a uniform, space-filling design generated via Latin Hypercube Sampling (LHS) because it yields better error bounds than those produced with uniform random sampling [19]. The space-filling requirement is attained using a maximin design. A maximin design is any collection of points $\Theta$ that maximizes the minimum distance between points: $\max_{\Theta} \min_{i} \ldots$

[...] $1/2$ the dynamics of consensus does not change noticeably. This should apply also to the dynamics of user participation in our model. We ran some simulations of the average lifetime, and found confirmation of this intuition. We thus restricted $\varepsilon$ to the interval $(0, 1/2)$.

4.2 Transient

Transient duration $T_0$ was determined empirically: we plotted the daily number of users $N_u(d; \theta)$, $d = 1, 2, \ldots$, for various values of the parameters $\theta$ and chose $T_0$ as the time after which all curves look stationary. Figure 1a reports the results of this exercise. In the figure, the shaded region corresponds to the transient interval $(0, T_0)$. The value of $T_0$ is 730 days. The values of $\theta$ were taken from a maximin LHD with 50 points. Each curve is scaled by its average value $\bar{N}_u(\theta)$ computed over the interval $d \in [731, 1095]$. The yellow solid line is a B-spline fit of 50 evenly spaced observations of the expected scaled number of users $N_u / \bar{N}_u$, and serves as a guide for the eye. During the transient phase we did not record any data, so that the estimation of $\tau$, on which our sensitivity analysis is based, did not reflect the dynamics of opinion formation during the transient.

¹ These statistics are freely available on http://stats.wikimedia.org.

4.3 Factor screening via global sensitivity analysis

We sampled a maximin Latin Hypercube Design (LHD) with 50 points using the intervals listed in Tab. 1.
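A maximin Latin hypercube design can be approximated by random search over candidate hypercubes, keeping the one whose closest pair of points is farthest apart. The sketch below illustrates the idea on the unit hypercube; the Euclidean distance, the function names and the default number of candidates are assumptions made for illustration, not details taken from the study.

```python
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """Random Latin hypercube on [0, 1]^n_dims: one point per stratum per axis."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)
        # Place each point uniformly at random inside its stratum
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def min_distance(points):
    """Smallest Euclidean distance between any two distinct points."""
    return min(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])


def maximin_lhd(n_points, n_dims, n_candidates=1000, seed=0):
    """Best of n_candidates random hypercubes under the maximin criterion."""
    rng = random.Random(seed)
    candidates = (latin_hypercube(n_points, n_dims, rng)
                  for _ in range(n_candidates))
    return max(candidates, key=min_distance)
```

A call such as `maximin_lhd(50, 10)` returns 50 points in ten dimensions; each coordinate `u` can then be rescaled to its parameter interval as `lo + u * (hi - lo)`.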
To sample a decent maximin design, we generated $10^4$ hypercubes at random and selected the one that maximized Eq. (3.1). We computed the average user lifetime $\langle\tau(\theta)\rangle$ by running 10 replications for each $\theta$ and averaging the values obtained.

We first plotted the values of the response variable $\tau$ versus each input parameter to check visually for any linear trend. Scatter plots are shown in Fig. 2. A multiple linear regression gave a coefficient of determination $R^2 = 0.83$. However, no clear trend emerges from the plots for any parameter except the confidence $\varepsilon$ and the long lifetime $\tau_0$. For the latter, something similar to a linear trend can be seen, whereas for the former the relationship looks more sigmoidal. We tried fitting a sigmoid function to $\tau$ as a function of $\varepsilon$. The result of a K-S test (p-value $< 3.5 \times 10^{-4}$) rejected the normality of the residuals, and therefore led us to exclude a sigmoid model as a possible functional form of $\tau(\varepsilon)$.

Next, we fitted a GP emulator to the average user lifetime data, using the open source machine learning toolkit from the SciKits collection.² We then discarded the simulator and used $\hat{\tau}(\theta)$ in lieu of it. To compute the sensitivity indices we used the Winding Stairs (WS) method, a resampling technique proposed in [14]. We computed main ($M_i$) and total interaction ($T_i$) effect indices for each parameter ($i = 1 \ldots 10$) using a WS matrix $W$ with $10^4$ rows. The results are shown in Tab. 2. The total variance $\hat{\sigma}^2$ was also computed from $W$ (each column of a WS matrix is an independent sample). The WS method yields better estimates of the total interaction effects than other methods [5], so we attribute the presence of some slightly negative values of $M_i$ to the uncertainty in the estimation of the total output variance $\sigma^2$ and to the presence of factors with almost null total effect. Only two factors have a $T_i > 3\%$.
These are the confidence $\varepsilon$ and the long-term lifetime $\tau_0$.

We explored further the individual contribution of each parameter to the output variance by looking at the main effect plots. These are plots of $Y(\theta_i)$ as a function of $\theta_i$, and can be obtained by evaluating Eq. (6) using Monte Carlo averaging and the GP emulator. To facilitate comparison of the different parameter ranges, in Fig. 1b we plotted the main effect as a function of the scaled parameter value. Figure 1b shows that $\rho_p$ and $\tau_1$ have a slight effect on user lifetime too, the first negative and the second positive.

² Home page: http://scikit-learn.sourceforge.net/.

Figure 2: Scatter plots of $\langle\tau\rangle$ (days) versus $\theta = (\lambda_e, \rho_u, \rho_p, \varepsilon, \mu, c_s, c_p, r, \tau_0, \tau_1)$. Panels: daily edits, daily users, daily pages, confidence, speed, const succ, const pop, rollback prob, short life, long life. Error bars (standard error of the mean lifetime computed over 10 realizations) are all smaller than the data points.

The difference between $T_i$ and $M_i$ is the fraction of variance that is only due to interactions between $\theta_i$ and any other parameter or groups of parameters. For $\varepsilon$ this difference is 0.08 and for $\tau_0$ it is 0.05. Summed up together, this residual interaction effect amounts to about three quarters (77%) of the total interaction effects from all remaining parameters. Thus we expect $\varepsilon$ and $\tau_0$ to have some interesting interactions with other parameters. We explored two-way interactions systematically using two-way interaction plots, which are the 3D counterparts of the curves of Fig. 1b. Given two parameters $\theta_i$ and $\theta_j$, with $i \ne j$, we computed $Y_{i,j}(\theta_i, \theta_j)$: we evaluated Eq. (6) in a similar way, this time holding fixed the values of two parameters instead of one.
Here we report the results on the interaction between $\varepsilon$ and the other parameters, including $\tau_0$. The plots are shown in Fig. 3 and 4. Almost all parameters show just a weak interaction with $\varepsilon$, which occurs at low ($\varepsilon < 0.1$) and high ($\varepsilon > 0.4$) values of it. Only the pair $\{\varepsilon, \tau_0\}$ shows a significant degree of interaction.

Table 2: Variance decomposition. Winding Stairs sample size $10^4$ rows, total variance 635.365 days².

Parameter        $M_i$     $T_i$
$\lambda_e$      -0.002    0.014
$\rho_u$         -0.003    0.02
$\rho_p$          0.003    0.027
$\varepsilon$     0.65     0.73
$\mu$            -0.004    0.03
$c_s$             0.004    0.03
$c_p$            -0.005    0.016
$r$              -0.005    0.026
$\tau_1$          0.002    0.03
$\tau_0$          0.18     0.23

4.4 User lifetime distribution

Previous studies on continuous opinion dynamics under bounded confidence show that, as $\varepsilon$ grows, the population of agents undergoes a gradual change from a regime with no consensus to a regime of total consensus with a single cluster [9, 12]. In our model this shift must somehow be reflected in the average user lifetime, but what shape does the user lifetime distribution take during it?

The findings from the previous section let us restrict the field of study to just two parameters of the original ten, namely $\varepsilon$ and $\tau_0$. In this section we focus only on them, and try to understand the actual distribution of user lifetimes by simulating from our model.

We performed simulations holding fixed the user lifetime parameters ($\tau_0 = 100$ days and $\tau_1 = 1$ hour), while changing the value of $\varepsilon$. The values of all other parameters were fixed to the midpoints of the respective ranges listed in Tab. 1. We computed the log-lifetime $u = \log(\tau)$ and fitted a 2-component Gaussian Mixture Model (GMM) to $u$. Figure 5 reports the result of the fitting, showing the densities of the individual components using stacked area plots. We report here only two values of $\varepsilon$: $\varepsilon = 0$ and $\varepsilon = 0.3$, the latter being greater than the threshold for consensus in Deffuant's model, to show the difference between the two regimes.
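The fitting step can be illustrated with a small expectation-maximization (EM) routine for a two-component Gaussian mixture on one-dimensional data such as the log-lifetimes $u$. This is a sketch under our own assumptions (initial means at the lower and upper quartiles, a fixed number of iterations, a variance floor); the text does not say which GMM implementation or settings were used, and the function names are hypothetical.

```python
import math
import random


def normal_pdf(u, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (u - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm2(data, n_iter=200, var_floor=1e-6):
    """EM for a 2-component 1-D Gaussian mixture.

    Returns (weights, means, variances), each a list of two floats.
    """
    xs = sorted(data)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((u - mean_all) ** 2 for u in xs) / n, var_floor)
    means = [xs[n // 4], xs[(3 * n) // 4]]   # assumed initialization
    variances = [var_all, var_all]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each observation
        resp = []
        for u in xs:
            p0 = weights[0] * normal_pdf(u, means[0], variances[0])
            p1 = weights[1] * normal_pdf(u, means[1], variances[1])
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: re-estimate weight, mean and variance of each component
        for k in (0, 1):
            r_k = [g if k == 0 else 1.0 - g for g in resp]
            total = sum(r_k)
            if total == 0:
                continue
            weights[k] = total / n
            means[k] = sum(g * u for g, u in zip(r_k, xs)) / total
            spread = sum(g * (u - means[k]) ** 2 for g, u in zip(r_k, xs)) / total
            variances[k] = max(spread, var_floor)
    return weights, means, variances
```

A quick check on synthetic data (not output of the model) draws two well-separated groups of log-lifetimes and verifies that the fit separates them:

```python
rng = random.Random(0)
u = [rng.gauss(-4.0, 0.5) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(700)]
weights, means, variances = fit_gmm2(u)
```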
5 Discussion

In this section we discuss the main findings of the present study. We presented an agent-based model of user participation in a peer-production community. We model participation as a bounded confidence consensus process, where users modify content according to their objectives and skills (represented by a continuous state $x$), and are in turn indirectly influenced by the rest of the community. We use global sensitivity analysis to study the importance of the model's parameters in explaining the average user lifetime.

The first interesting, and rather surprising, finding is that, as shown in Tab. 2, of the overall ten parameters of the model, only two affect the average user lifetime in a considerable way. This is interesting because it suggests that several other factors, like content popularity, user community growth, and user activity rate, are not as important as the general level of "tolerance" of the community (given by the confidence $\varepsilon$) in affecting the process of group consensus. Moreover, interaction plots show that relevant interactions occur between $\varepsilon$ and $\tau_0$: this confirms the intuition that the role of $\tau_0$ is to set the support of the distribution of $\tau$, and that $\varepsilon$ acts as a switch, controlling the transition from a regime where only short-term forms of participation are possible, due to the low rate of successful user-page interactions, to a consensus regime where a cluster of long-term users is able to emerge.

Of course, the results from the factor screening should also be viewed in light of our simulation setup. We decided to focus on a stable community, where the number of users $N_u$ is stationary, and not on the initial phase of community formation. Plausibly, during this transient phase other parameters, such as the speed $\mu$ and the rollback probability $r$, might have more importance in determining the span of user participation.
The second interesting finding is about the actual distribution of user participation, which is markedly bimodal. From Fig. 5 it is possible to appreciate, for $\varepsilon = 0.3$, a clear subdivision into two groups of users based on their participation span. We can also see a subdivision for $\varepsilon = 0$, which is probably related to the fact that $c_s = 50$ in that setup. Although we did not perform a proper model calibration, this finding is encouraging, as previous studies on the distribution of user account lifetimes in Wikipedia have shown a similar bimodal pattern [7].

In general, both findings show that agent-based models can be studied through the systematic use of simulations and computer code emulation, and provide a novel connection between models of opinion dynamics, whose study has so far been notoriously lacking on the empirical side [4, 27], and peer production.

Acknowledgments. The author thanks Alberto Vancheri and Paolo Giordano for the insightful discussions; the anonymous reviewers, for the suggestions on how to improve the manuscript; and the conference organization, for their generous financial support.

References

[1] Backstrom, L., Kumar, R., Marlow, C., Novak, J., Tomkins, A.: Preferential behavior in online groups. WSDM'08 pp. 1–11 (Dec 2007)
[2] Beschastnikh, I., Kriplean, T., McDonald, D.W.: Wikipedian self-governance in action: Motivating the policy lens. In: Proceedings of the second ICWSM conference (2008)
[3] Carletti, T., Fanelli, D., Guarino, A., Bagnoli, F., Guazzini, A.: Birth and death in a continuous opinion dynamics model. Eur. Phys. J. B 64(2), 285–292 (Jul 2008)
[4] Castellano, C., Fortunato, S., Loreto, V.: Statistical physics of social dynamics. Rev. Mod. Phys. 81(2), 591–646 (May 2009)
[5] Chan, K., Saltelli, A., Tarantola, S.: Winding stairs: A sampling tool to compute sensitivity indices. Statistics and Computing 10, 187–196 (2000), http://dx.doi.org/10.1023/A:1008950625967
[6] Choi, B., Alexander, K., Kraut, R.E., Levine, J.M.: Socialization tactics in wikipedia and their effects. In: CSCW '10: Proceedings of the 2010 ACM conference on Computer supported cooperative work. pp. 107–116. ACM, New York, NY, USA (2010)
[7] Ciampaglia, G.L., Vancheri, A.: Empirical analysis of user participation in online communities: the case of wikipedia. In: Proceedings of ICWSM (2010)
[8] Ciffolilli, A.: Phantom authority, self-selective recruitment and retention of members in virtual communities: The case of wikipedia. First Monday 8(12) (Dec 2008), http://firstmonday.org/issues/issue8_12/ciffolilli/index.html
[9] Deffuant, G., Neau, D., Amblard, F., Weisbuch, G.: Mixing beliefs among interacting agents. Adv. Comp. Sys. 3, 87–98 (2001)
[10] Goldman, E.: Wikipedia's labor squeeze and its consequences. Telecomm. and High Tech. Law 8, 157–184 (August 2009)
[11] Grabowski, A., Kosiński, R.A.: Life span in online communities. Phys. Rev. E 82(6), 066108 (Dec 2010)
[12] Hegselmann, R., Krause, U.: Opinion dynamics and bounded confidence–models, analysis, and simulation. J. Art. Soc. Soc. Sim. 5(3), paper 2 (2002), http://jasss.soc.surrey.ac.uk/5/3/2.html
[13] Hogg, T., Lerman, K.: Stochastic models of user-contributory web sites. pp. 50–57 (2009)
[14] Jansen, M., Rossing, W., Daamen, R.: Monte-Carlo Estimation Of Uncertainty Contributions From Several Independent Multivariate Sources. In: Grasman, J., van Straten, G. (eds.) Predictability And Nonlinear Modelling In Natural Sciences And Economics. pp. 334–343 (1994)
[15] Kittur, A., Kraut, R.E.: Beyond wikipedia: Coordination and conflict in online production groups. pp. 215–224 (2010), http://dx.doi.org/10.1145/1718918.1718959
[16] Leskovec, J., Backstrom, L., Kumar, R., Tomkins, A.: Microscopic evolution of social networks. In: KDD '08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 462–470. ACM, New York, NY, USA (2008)
[17] Lin, H.F., Lee, G.G.: Determinants of success for online communities: an empirical study. Behaviour & Information Technology 25(6), 479–488 (Nov-Dec 2006)
[18] Lorenz, J.: Continuous opinion dynamics under bounded confidence: A survey. Intl J. Mod. Phys. C 18, 1819–1838 (2007)
[19] McKay, M.D.: Latin hypercube sampling as a tool in uncertainty analysis of computer models. In: Proceedings of the 24th conference on Winter simulation. pp. 557–564. WSC '92, ACM, New York, NY, USA (1992), http://doi.acm.org/10.1145/167293.167637
[20] Ortega, F., Izquierdo-Cortazar, D.: Survival analysis in open development projects. In: Proceedings of the 2009 ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development. pp. 7–12. FLOSS '09, IEEE Computer Society, Washington, DC, USA (2009), http://dx.doi.org/10.1109/FLOSS.2009.5071353
[21] Panciera, K., Halfaker, A., Terveen, L.: Wikipedians are born, not made. In: Proceedings of GROUP'09 (2009)
[22] Raafat, R.M., Chater, N., Frith, C.: Herding in humans. Trends in Cognitive Sciences 13(10), 420–428 (2009), http://www.sciencedirect.com/science/article/B6VH9-4X6PPCY-1/2/38f26f1994570f7a58d587bb5a7a0569
[23] Rafaeli, S., Ariel, Y.: Psychological aspects of cyberspace: Theory, research, applications, chap. Online Motivational Factors: Incentives for Participation and Contribution in Wikipedia, pp. 243–267. Cambridge University Press (2008)
[24] Saltelli, A., Tarantola, S., Campolongo, F., Ratto, M.: Sensitivity Analysis in Practice–A guide to Assessing Scientific Models. John Wiley & Sons, Ltd. (2004)
[25] Santner, T., Williams, B., Notz, W.: The Design and Analysis of Computer Experiments. Springer-Verlag, New York (2003)
[26] Schroer, J., Hertel, G.: Voluntary engagement in an open web-based encyclopedia: Wikipedians and why they do it. Media Psychology 12(1), 96–120 (2009)
[27] Sobkowicz, P.: Modelling opinion formation with physics tools: Call for closer link with reality. Journal Artificial Societies and Social Simulation 12(1), 11 (2009), http://jasss.soc.surrey.ac.uk/12/1/11.html
[28] Sobol', I.M.: Global sensitivity indices for nonlinear mathematical models and their monte carlo estimates. Mathematics and Computers in Simulation 55(1-3), 271–280 (February 2001), http://www.sciencedirect.com/science/article/B6V0T-42DX509-11/2/7992ee21d186afc323213675e8547d6f
[29] Wilkinson, D.M.: Strong regularities in online peer production. In: Proceedings of the 9th ACM conference on Electronic commerce. Chicago, Illinois USA (2008)
[30] Wilkinson, D.M., Huberman, B.A.: Cooperation and quality in wikipedia. In: Proceedings of WikiSym '07, 3rd Intl Symposium on Wikis. Montréal, Québec, Canada (October, 21–23 2007)
[31] Yang, J., Wei, X., Ackerman, M., Adamic, L.: Activity lifespan: An analysis of user survival patterns in online knowledge sharing communities. In: Proceedings of the International AAAI Conference on Weblogs and Social Media (2010), http://aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/view/1466/1856

Figure 3: Two-way interaction plots. Panels show lifetime as a function of confidence and, in turn, long life, short life, speed, and rollback prob.

Figure 4: Two-way interaction plots (cont'd). Panels show lifetime as a function of confidence and, in turn, daily users, daily pages, and daily edits.

Figure 5: GMM fit of log-lifetime of user accounts in two different runs of the model. Panels: $\varepsilon = 0$ and $\varepsilon = 0.3$; horizontal axis $u = \log(\tau)$ (days), vertical axis probability density $p(x)$. For $\varepsilon > \varepsilon_c$ a bimodal pattern is a clear feature of user participation.
