Towards Realistic Vehicular Network Modeling Using Planet-scale Public Webcams

Realistic modeling of vehicular mobility has been particularly challenging due to a lack of large libraries of measurements in the research community. In this paper we introduce a novel method for large-scale monitoring, analysis, and identification …

Authors: Gautam S. Thakur, Pan Hui, Hamed Ketabdar, Ahmed Helmy

Gautam S. Thakur‡§, Pan Hui‡, Hamed Ketabdar‡, Ahmed Helmy§
§CISE, University of Florida, Gainesville; ‡Deutsche Telekom Laboratories, Berlin
gsthakur@cise.ufl.edu, pan.hui@telekom.de, hamed.ketabdar@telekom.de, helmy@cise.ufl.edu

ABSTRACT

Realistic modeling of vehicular mobility has been particularly challenging due to a lack of large libraries of measurements in the research community. In this paper we introduce a novel method for large-scale monitoring, analysis, and identification of spatio-temporal models for vehicular mobility using the freely available online webcams in cities across the globe. We collect vehicular mobility traces from 2,700 traffic webcams in 10 different cities for several months and generate a mobility dataset of 7.5 terabytes consisting of 125 million images. To the best of our knowledge, this is the largest data set ever used in such a study. To process and analyze this data, we propose an efficient and scalable algorithm to estimate traffic density based on background image subtraction. Initial results show that, in each of four cities, at least 82% of individual cameras follow a log-logistic distribution with less than 5% deviation, and that 94% of cameras in Toronto follow a gamma distribution. The aggregate results from each city also demonstrate that the log-logistic and gamma distributions pass the KS test with 95% confidence. Furthermore, many of the camera traces exhibit long-range dependence, with self-similarity evident in the aggregates of traffic (per city). We believe our novel data collection method and dataset provide a much needed contribution to the research community for realistic modeling of vehicular networks and mobility.

1. INTRODUCTION

Research in the area of vehicular networks has increased dramatically in recent years.
With the proliferation of mobile networking technologies and their integration with the automobile industry, various forms of vehicular networks are being realized. These networks include vehicle-to-vehicle, vehicle-to-roadside, and vehicle-to-roadside-to-vehicle architectures. Realistic modeling, simulation, and informed design of such networks face several challenges, mainly due to the lack of large-scale, community-wide libraries of vehicular data measurements and of representative models of vehicular mobility.

Earlier studies in this area have clearly established a direct link between vehicular density distribution and the performance [16, 3] of vehicular network primitives and mechanisms, including broadcast and geocast protocols [1]. Although good initial efforts have been made to capture realistic vehicular density distributions, such efforts were limited by the availability of sensed vehicular data [20]. Hence, there is a real need to conduct vehicular density modeling using larger-scale and more comprehensive data sets. Furthermore, commonly used assumptions, such as an exponential distribution [19] of vehicular inter-arrival times [1], have been used to derive many theories and conduct several analyses, the validity of which bears further investigation.

In this study, we provide a novel framework for the systematic monitoring, measurement, analysis, and modeling of vehicular density distributions at a large scale. To avoid the limitations of sensed vehicular data, we instead utilize the existing global infrastructure of tens of thousands of video cameras providing a continuous stream of street images from dozens of cities around the world.
Millions of images captured from publicly available traffic web cameras are processed using a novel density estimation algorithm to help investigate and understand the traffic patterns of cities and major highways. Our algorithm employs simple, scalable, and effective background subtraction techniques to process the images and build an extensive library of spatio-temporal vehicular density data.

As a first step toward realistic vehicular network modeling, we aim to provide a comprehensive view of the fundamental statistical characteristics of the vehicular traffic density exhibited by the data from four major cities over 45 days. Two main sets of statistical analyses are conducted. The first investigates the best-fit distribution for the arrival process using individual cameras and aggregate city data, while the second studies the long-range dependence (LRD) and self-similarity observed in the data. Our early analysis shows two main results: (i) the empirical distributions of vehicular densities in most of the cameras and cities follow 'log-logistic' and 'gamma' distributions; (ii) the data consistently show a high degree of self-similarity over orders of magnitude of time scales, in all cities and for many cameras. This suggests a long-range-dependent process governing the vehicular arrival process in many realistic scenarios. Such a result is in sharp contrast to the assumptions of memoryless processes commonly used for vehicular mobility.

The contributions of this work are manifold. (i) To the best of our knowledge, we provide by far the largest and most extensive library of vehicular density data, based on processing of millions of images obtained from ten main cities and thousands of cameras. This addresses a severe shortage of such data sets in the community.
The library will be made available to the research community in the future. (ii) We propose a fast algorithm for traffic density estimation to efficiently process millions of image files. (iii) We establish log-logistic and gamma distributions as the most suitable fits for the vehicular density distribution and provide early evidence of self-similarity exhibited by the traffic at various time scales.

The rest of the document is organized as follows. Section 2 discusses related work. In Section 3, we discuss our vehicular data set. In Section 4, we discuss our background subtraction algorithm and the detection and removal of outliers. Statistical analysis of the measurements and modeling is presented in Section 5. Finally, we conclude the paper in Section 6 and give insight into future work.

2. RELATED WORK

Large-scale mobility datasets are very important for the mobile network and computing research community, but collecting them is challenging and usually expensive [8]. In this paper, we propose an inexpensive method to collect global-scale vehicular mobility traces using thousands of freely available webcams that provide continuous and fine-grained monitoring of vehicular traffic.

Existing studies in transportation sciences focus on improving road traffic and on using structural engineering methods to resolve issues of congestion, evacuation, and mitigation plans. Initial work [5] mainly focused on developing infrastructure for the movement of vehicles on roads and bridges. However, in recent times [7] much focus has been given to the use of sensor data. The latter helps to engineer better traffic conditions, ensuring safety and management of traffic. For example, inductive loop detectors are installed to monitor traffic flows.
However, the data generated by these sensors is not readily available to the general public. Second, these studies [4] do not necessarily focus on vehicular networks, traffic modeling, and characterization. In spite of these data availability problems, there is, surprisingly, a large deployment of publicly available online web cameras, which can be used for monitoring and modeling traffic. In our work, we take advantage of these free webcams. To our knowledge, we are the first to identify the power and usability of these free web cameras for the purpose of modeling and characterizing traffic across the globe.

Simulation tools like CORSIM [7] and VISSIM [11] are geared to model specific scenarios for planning future traffic conditions at a micro-mobility, small-scale level. In this work, we focus on macro-mobility, modeling vehicular movements in the form of flow densities to analyze traffic at a very large scale. From a networking perspective, mobility models [4, 14] and routing [21] techniques investigate how mobility impacts the performance of routing protocols [2]. If the mobility model is unrealistic, then routing performance is questionable. So, we need models inspired by real data sets. By way of this work, we believe a comprehensive set of parameters can be extracted to develop such models.

In a recent work, Bai et al. [1] analyzed spatio-temporal variations in vehicular traffic for the purpose of inter-vehicle communications. Data collected from realistic scenarios shows the effectiveness of the exponential model for highway vehicle traffic. Along the same lines, quantitative characteristics of vehicle arrival patterns on highways are studied in [13].
By using real highway traffic data, the study examines the existence of self-similarity characteristics in vehicle arrival data and finds that the time headway of vehicles on highways follows a heavy-tailed distribution. These findings enrich traffic modeling, but were obtained on a very small sample of data, mainly localized to one or two locations. In our study, we use 45 days of vehicular imagery data from four cities to model traffic and characterize the density distribution.

A principal activity related to our work is image processing and efficient retrieval of traffic information from these images. Many studies [5] have been carried out that look into aspects of both background subtraction [15, 17] and object detection [10]. In the former methods [6], the difference between the current and reference frame is used to identify objects. In detection approaches [18], learned object features (shape, size, etc.) are used to detect and classify objects. In our work, we use a temporal method for background subtraction to calculate a relative numerical value instead of counting cars. We find background subtraction to be much faster than object detection, as discussed in detail in a later section.

Table 1: Global Webcam Datasets

City          # of Cameras  Duration               Interval  Records        Database Size
Bangalore     160           30/Nov/10 - 01/Mar/11  180 sec.  2.8 million    357 GB
Beaufort      70            30/Nov/10 - 01/Mar/11  30 sec.   24.2 million   1150 GB
Connecticut   120           21/Nov/10 - 20/Jan/11  20 sec.   7.2 million    435 GB
Georgia       777           30/Nov/10 - 02/Feb/11  60 sec.   32 million     1400 GB
London        182           11/Oct/10 - 22/Nov/10  60 sec.   1 million      201 GB
London (BBC)  723           30/Nov/10 - 01/Mar/11  60 sec.   20 million     1050 GB
New York      160           20/Oct/10 - 13/Jan/11  15 sec.   26 million     1200 GB
Seattle       121           30/Nov/10 - 01/Mar/11  60 sec.   8.2 million    600 GB
Sydney        67            11/Oct/10 - 05/Dec/10  30 sec.   2.0 million    350 GB
Toronto       89            21/Nov/10 - 20/Jan/11  30 sec.   1.8 million    325 GB
Washington    240           30/Nov/10 - 01/Mar/11  60 sec.   5 million      400 GB
Total         2709          -                      -         125.2 million  7468 GB

Figure 1: Infrastructure for measurement collection.

Figure 2: Traffic cameras in (a) London and (b) Sydney. The red dots show the locations of the deployed cameras.

3. DATA COLLECTION

There are thousands, if not millions, of outdoor cameras currently connected to the Internet, placed by governments, companies, conservation societies, national parks, universities, and private citizens. Outdoor webcams are usually mounted on a roadside pole, which makes them easy to access, install, and maintain, and they have seen wide application not only in adaptive traffic control and information systems, but also in monitoring weather conditions, advertising the beauty of a particular beach or mountain, or providing a view of animal or plant life at a particular location. We view the connected global network of webcams as a highly versatile platform with untapped potential to monitor global trends, or changes, in the flow of the city, and to provide large-scale data to realistically model vehicular, or even human, mobility.

In this section, we introduce the methodology for the data collection and give high-level statistics of the data traces. We collect vehicular mobility traces from online webcams using our own crawler. A majority of these webcams are deployed by the Department of Transportation (DoT) in each city. They are used to provide real-time information about road traffic conditions to the general public via online traffic web cameras. These web cameras are typically installed on traffic signal poles facing the roads at prominent intersections throughout the city and along highways.
At regular intervals, these cameras capture still pictures of ongoing road traffic and send them as feeds to the DoT's media server. For the purpose of this study, we chose 10 cities with a large number of webcams and obtained permission from the concerned DoTs to collect this vehicular imagery data for several months. We cover cities in North America, Europe, Asia, and Australia. Fig. 1 shows our experimental infrastructure to download and maintain the image data. Since these cameras provide better imagery during the daytime, we limit our study to downloading and analyzing images only during such hours. On average, we download 15 gigabytes of imagery data per day from about 2,700 traffic web cameras, for an overall dataset of about 7.5 terabytes containing around 125 million images. Table 1 shows the high-level statistics of the data sets we collected. Each city has a different number of deployed cameras and a different interval between captured images. For example, cameras for the city of Sydney capture images at an interval of one minute, while for the state of Connecticut the interval between two consecutive snapshots is only 20 seconds. The cameras are widely deployed geographically, covering major sections of cities and highways. Fig. 2 gives an example of the camera deployments in London and Sydney, obtained by mapping the Global Positioning System (GPS) locations of the cameras onto Google Maps. The area covered by the cameras is 950 km² in London and 1500 km² in Sydney. Hence, we believe our study will be comprehensive and will reflect major trends in the traffic movement of cities.

4. ALGORITHM TO EXTRACT TRAFFIC DENSITIES

We aim to estimate traffic density on roads, considering the number of vehicles or pedestrians crossing the road. We have a sequence of images $I_1(x, y), I_2(x, y), \ldots, I_z(x, y)$ captured by a webcam. We have to separate the information we need, e.g. the number of vehicles and pedestrians, from the background image, which is normally the road and surrounding buildings. The main factor that distinguishes vehicles from the background (road, buildings) is that vehicles do not remain stationary for long periods of time, whereas the background is stationary. The solution, then, is to apply a form of high-pass filtering over the sequence of images captured by a webcam over time. The high-pass filter removes the stationary part of the images (road, buildings, etc.) and keeps the moving components (mainly vehicles).

To implement such a high-pass filter, we subtract the result of a low-pass filter over the sequence of images from each still image. This is practically equivalent to applying a high-pass filter over the sequence of images. To obtain the low-pass filtering effect, we run a moving average filter over the time sequence of images obtained from one webcam. The duration of the moving average filter can be adjusted in an ad hoc way. The moving average filter is implemented simply by averaging the intensity maps of several images over a certain duration: at its output, the intensity of each pixel is the average intensity of the corresponding pixels in the interval. The output of the moving average (low-pass) filter is normally the required background image, a still image of the street and buildings. Therefore, subtracting the output of the low-pass filter from each image gives us the moving components (e.g. vehicles). This is in fact the high-pass component of the image over time. Having the high-pass component of the image, the vehicles are highlighted against the background.
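The low-pass and high-pass construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are tiny grayscale images stored as nested lists, and the helper names `moving_average_background` and `high_pass` are hypothetical.

```python
def moving_average_background(frames):
    """Low-pass filter: per-pixel mean over a window of frames.

    `frames` is a non-empty list of equally sized grayscale images,
    each given as a list of rows of numeric intensities (an assumed
    layout chosen only for this sketch).
    """
    z = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) / z for c in range(cols)]
        for r in range(rows)
    ]


def high_pass(frame, background):
    """High-pass component: the frame minus the low-pass background."""
    return [
        [pixel - bg for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Toy sequence: a static scene of intensity 10 with a bright "vehicle"
# (intensity 110) occupying one pixel in the last frame only.
static = [[10, 10], [10, 10]]
with_vehicle = [[10, 110], [10, 10]]
frames = [static, static, static, with_vehicle]

background = moving_average_background(frames)
residual = high_pass(with_vehicle, background)
```

In this toy case the stationary pixels cancel to zero in `residual`, and only the pixel touched by the vehicle keeps a large value. With a longer averaging window a passing vehicle contributes less to the mean, so the background estimate moves closer to the stationary scene.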
One may then use regular object detection techniques to identify and count the number of vehicles in the high-pass filtered image. However, applying such techniques may require heavy computation, and at the same time it can be unnecessary. As an alternative, we simply count the number of active pixels (pixels with a value higher than a certain threshold). Such a process can be much faster than detecting and counting objects in an image. At the same time, it can be much more effective, because we are looking for the percentage of the street (road) that is covered by vehicles, as an indicator of how crowded the street is, rather than the number of vehicles. The number of vehicles is not necessarily a good indicator of crowdedness, as a long vehicle may introduce more traffic than a small one. Secondly, counting active pixels overcomes the issues that object detection algorithms face in conditions of severe congestion, one of which is the visibility of the boundary contours used to separate objects from one another. In contrast, the number of active pixels indicates what percentage of the road is covered, no matter how many vehicles are on the road.

That said, consider that an image can be represented as

$$I(x, y) = L(x, y) + T(x, y) + N(x, y)$$

where $I(x, y)$ is the captured image, $L(x, y)$ is our low-pass filter, and $T(x, y)$ and $N(x, y)$ are respectively the traffic and the noise associated with the image. In the first step, we generate a low-pass filter using the aforementioned moving average technique. Initially, we average a given pixel with its left and right neighbors in the time sequence. For the purpose of this study, we kept the number of neighbors at $z = 100$. The averaging removes the dominant trends, which are $T(x, y)$ and $N(x, y)$. This low-pass filter remains constant for one camera:

$$L(x, y) = \frac{I_1(x, y) + I_2(x, y) + \cdots + I_z(x, y)}{z}$$

To get the traffic density associated with an image, we subtract the low-pass filter and set a threshold ($\tau$), rejecting any resulting pixel value below it so as to reduce the effect of noise (shadows, etc.) $N(x, y)$. In summary,

$$I'(x, y) = I(x, y) - L(x, y) \quad \text{such that} \quad I'(x, y) > \tau.$$

Later, we convert the image to grayscale, $I''(x, y)$, and sum the pixels to get the traffic density ($d$):

$$d = \sum_{x=0}^{m} \sum_{y=0}^{n} I''(x, y)$$

Outliers Detection and Removal

Figure 3: Outlier detection and removal, plotted as traffic density against image count. (a) Outliers present, detected by encircling them. (b) Outliers removed: the actual traffic density distribution.

Collecting images on such a large scale requires automated processes to manage them and extract useful information. As mentioned, different cameras have different refresh rates, so we have to continuously download images at a specific time interval for each camera. To ensure that we do not miss even a single traffic snapshot, we keep our download interval a little shorter than the camera refresh rate. However, this results in a few duplicate images, which we filter out as a first step toward outlier detection and removal. Normally, the downloaded data set contains images that are snapshots of vehicular traffic on the roads. But in many instances, the images are corrupted: they are zero-sized or contain extraneous bytes (noise). In addition, if the camera is non-functional or has mechanical errors, the traffic monitoring server replaces the current traffic snapshot with an error notification image.
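The first-pass filtering described above (dropping duplicate downloads, zero-sized files, and error notices) could look roughly like the sketch below. It is only an illustration: the `prefilter` function, the use of SHA-256 digests to spot byte-identical files, the `known_error_digests` set, and the in-memory `(name, bytes)` layout are all assumptions made here, not details given in the paper. Images with extraneous bytes are not handled and would be left to a later detection step.

```python
import hashlib


def prefilter(snapshots, known_error_digests=frozenset()):
    """Drop zero-sized, duplicate, and known error-notice images.

    `snapshots` is an iterable of (name, raw_bytes) pairs in download
    order. `known_error_digests` is a hypothetical set of SHA-256 hex
    digests of error-notification images collected by hand.
    Returns the names of the snapshots that are kept.
    """
    seen = set()
    kept = []
    for name, raw in snapshots:
        if not raw:  # zero-sized download
            continue
        digest = hashlib.sha256(raw).hexdigest()
        if digest in known_error_digests:  # server-side error placeholder
            continue
        if digest in seen:  # byte-identical duplicate of an earlier image
            continue
        seen.add(digest)
        kept.append(name)
    return kept


error_image = b"CAMERA OFFLINE"
downloads = [
    ("cam1_0001.jpg", b"frame-A"),
    ("cam1_0002.jpg", b"frame-A"),   # fetched twice before a refresh
    ("cam1_0003.jpg", b""),          # corrupted, zero-sized
    ("cam1_0004.jpg", error_image),  # error notification
    ("cam1_0005.jpg", b"frame-B"),
]
kept = prefilter(downloads, {hashlib.sha256(error_image).hexdigest()})
```

Hashing the raw bytes only catches exact repeats, which matches the situation where the same snapshot is fetched twice before the camera refreshes.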
The challenge here is to detect all such errors and remove them before modeling and statistical analysis. The analysis becomes more complex because we do not know the underlying distribution, and hence statistical techniques that rely on a particular distribution (boxplot, etc.) cannot be used. We used semi-supervised learning and data mining to overcome the challenges of outlier detection and removal in millions of traffic images.

In our case, we treat the data set $X$ containing all types of images as $X = \{x_1, x_2, x_3, \ldots, x_n\}$. We then divide this set into two parts. The first part contains the data points $X_l = \{x_1, x_2, x_3, \ldots, x_l\}$ mapped to labels $Y_l = \{y_1, y_2, y_3, \ldots, y_l\}$. The input features provided for detecting outliers include, but are not limited to, image size, color depths, multi-channel color arrays, and image segmentation stderrs. The second part contains points with unknown labels, represented as $X_u = \{x_{l+1}, x_{l+2}, x_{l+3}, \ldots, x_{l+u}\}$ such that $u \gg l$. The already known and learned labeled points are later used to find cluster boundaries and to assign a class to each cluster.

In this case, we used the low-density separation assumption, which helps to cut the dataset into clusters. The identified clusters that are mostly distant from the regular traffic density data are separated out as outliers. In Fig. 3, we compare the results before and after detecting and removing the outliers.

Figure 4: A series of pictures of the same intersection with varying traffic intensities: (a) low, d = 2023, 0.28; (b) medium, d = 5400, 0.55; (c) high, d = 9230, 0.93. This variation is captured by the density parameter d. The first value is the result of background subtraction and the second is the normalized value.

Figure 5: Traffic arrival process on an hourly basis for 45 days. A regular pattern of high traffic intensity during morning and evening hours is evident.

5. TOWARD REALISTIC VEHICULAR NETWORK MODELING

As a first step toward realistic modeling of vehicular communication networks, we focus on two studies of the traffic arrival process in this paper: modeling the densities ($d$) against well-known probability distributions, and analyzing the typical traffic burstiness using self-similarity analysis. The objective of this study is thus to understand the underlying statistical patterns and model the arrival processes. The models were selected based on their applicability in everyday statistical analysis and through several iterations of modeling, which showed that the traffic closely follows (with little deviation) one or more of the discussed probability distributions. Due to the page limit, and as this is an early study, in this section we only present results from 4 representative cities (London, Sydney, Toronto, and Connecticut), with a total of 458 cameras and 12 million images. An important underlying fact about the traffic densities is that they approximate the relative traffic on the roads. This assumption is different from counting cars using loop detectors or other sensors. In Fig. 4, we depict three traffic scenarios of varying intensity, from a low-traffic to a fully congested intersection, for the same camera, as captured by the density parameter ($d$).

5.1 Traffic Flow Characterization

In order to investigate the nature of traffic, we take a holistic approach to systematically extract individual and aggregate flows of the traffic densities from the images. Each individual flow constitutes a distribution of traffic densities that demonstrates the flow of traffic as viewed from an individual camera. This helps us to better understand traffic intensity at the microscopic level of each intersection. The aggregate traffic combines the flows from all the cameras in time order. The main advantage of analyzing aggregate traffic is that it reveals emergent properties, helps to model and profile the city, and allows intelligent guesses about a different city based on this aggregate.

Figure 6: Modeling the distribution for aggregate traffic densities. Panels: (a) Connecticut, (b) London, (c) Sydney, (d) Toronto. Each panel plots the empirical CDF of the data (CI 95%) against Exponential, Gamma, Log-Logistic, Normal, and Weibull fits.

Table 2: Dominant distributions as best fits (by ranking)

City         1st Best Fit  2nd Best Fit  3rd Best Fit
Connecticut  L[87%]        G[11%]        E[0.5%]
London       L[42%]        G[39%]        W[16%]
Sydney       L[62%]        G[32%]        N[2%]
Toronto      G[46%]        W[31%]        L[21%]

E = Exponential, G = Gamma, L = Log-logistic, N = Normal, W = Weibull

Table 3: Dominant distributions as best fits (by % deviation, KS test)

City         ≤ 3%                              ≤ 5%
Connecticut  L[62%], G[15%], W[3%]             L[94%], G[44%], W[19%]
London       G[34%], L[34%], W[10%], N[0.5%]   L[82%], G[70%], W[47%], N[7%]
Sydney       L[88%], G[61%], W[4%], N[2%]      L[98%], G[88%], W[44%], N[18%]
Toronto      G[75%], W[58%], L[34%]            G[94%], W[88%], L[87%], E[4%], N[1%]

Figure 7: Cumulative plots for three varying traffic intensities captured per city. The individual flows are characterized by low (L), medium (M), and high (H) traffic intensities. Panels: (a)-(c) Connecticut (L, M, H), (d)-(f) London (L, M, H), (g)-(i) Sydney (L, M, H), (j)-(l) Toronto (L, M, H). Each panel plots the empirical CDF of the data (CI 95%) against Exponential, Gamma, Loglogistic, Normal, and Weibull fits.

Figure 8: The percentage of cameras from all four cities covered by each distribution (x-axis: distribution model, i.e. Exponential, Gamma, Loglogistic, Normal, Weibull; y-axis: average % as first best fit). The values in the boxes show the percentage deviation error from the empirical data.
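The comparison of candidate distributions by KS deviation, as summarized in Tables 2 and 3, can be illustrated with a small standard-library sketch. Several things here are assumptions rather than the paper's procedure: the densities are synthetic draws from a log-logistic distribution, only two of the five candidates are fitted, the log-logistic parameters are estimated by matching moments of the log-data instead of by maximum likelihood, and the "deviation" is taken to be the Kolmogorov-Smirnov statistic. All function names are hypothetical.

```python
import math
import random
import statistics


def ks_distance(sample, cdf):
    """Kolmogorov-Smirnov statistic between a sample and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Largest gap between the model CDF and the empirical step function.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def fit_exponential(sample):
    """Exponential CDF with the rate set to 1 / sample mean."""
    rate = 1.0 / statistics.fmean(sample)
    return lambda x: 1.0 - math.exp(-rate * x)


def fit_loglogistic(sample):
    """Log-logistic CDF fitted by matching moments of log(x).

    If X is log-logistic(alpha, beta), then ln X is logistic with
    location ln(alpha) and scale 1/beta, whose variance is
    (pi / beta)**2 / 3.
    """
    logs = [math.log(x) for x in sample]
    alpha = math.exp(statistics.fmean(logs))
    beta = math.pi / (math.sqrt(3.0) * statistics.pstdev(logs))
    return lambda x: 1.0 / (1.0 + (x / alpha) ** -beta)


# Synthetic normalized densities drawn by inverting the log-logistic CDF.
rng = random.Random(0)
alpha_true, beta_true = 0.3, 4.0
densities = []
for _ in range(2000):
    u = rng.uniform(1e-9, 1.0 - 1e-9)
    densities.append(alpha_true * (u / (1.0 - u)) ** (1.0 / beta_true))

d_exp = ks_distance(densities, fit_exponential(densities))
d_ll = ks_distance(densities, fit_loglogistic(densities))
print(f"exponential D = {d_exp:.3f}, log-logistic D = {d_ll:.3f}")
```

On data of this shape the exponential fit shows a much larger KS distance than the log-logistic fit, which is the kind of gap the ranking in Table 2 reflects. In practice a statistics library would be the more natural choice; for instance, SciPy exposes the log-logistic distribution as `scipy.stats.fisk`.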
When analyzing the traffic, an important activity is to choose the granularity of the traffic for various purposes. For example, hourly patterns provide a good estimate of the nature of congestion during morning and evening times, which the flow at the individual density level may not depict. On the other hand, a finer granularity helps to understand sudden spikes in the traffic flow and to plan congestion mitigation. In this work, we choose to look into all these patterns by modeling flows against well-known probability distributions. Fig. 5 gives an example of the traffic density on an hourly basis for one of the cameras in Sydney. We can observe that there is in general high traffic density during the peak hours and low traffic density between 10am and 2pm (off-peak time), which provides positive confirmation that our algorithm can effectively detect traffic.

Fig. 7 shows the cumulative distribution function of the traffic for three individual cameras in each city, with low, medium, and high average traffic. We can see that traffic at individual cameras can vary a lot, but in general the log-logistic, gamma, and Weibull distributions capture some of the key features of the data. Log-logistic is the best approximation for individual camera traffic in three of the four cities (gamma ranks first in Toronto). Table 2 gives the detailed statistics of the fitting, ranking the best fits, i.e. the distributions that showed the least deviation under the KS test. In Table 3, we measure the deviation from the empirical data and sample the cameras at 3% and 5% error levels. In Fig. 8, the results show the average dominance of each distribution across the four cities. We find that even at the individual (per-camera) level, the log-logistic distribution provides a good estimate of the empirical data. As is evident, log-logistic and gamma closely match the empirical data distribution.

Finally, Fig. 6 shows the cumulative statistics for the aggregated traffic of each city. We can observe that different cities have different aggregated traffic; for example, London in general has more traffic than Connecticut.

5.2 Long Range Dependence

In [9, 12], the authors demonstrate the existence of long-range dependence and the self-similar nature of Ethernet traffic, which has serious implications for the design and analysis of computer networks. Inspired by this study of the arrival process of Ethernet packets in wired networks, we also characterize the nature of vehicular traffic and investigate long-range dependence. Self-similarity means that aggregate traffic statistics show long-range dependence and that the correlation decays more slowly than exponentially. In Fig. 9(a-d), we show time series plots at four different time resolutions for the city of Sydney. Initially, we plot with a time interval unit of one minute. Each subsequent plot is derived from the previous one, with a time interval one order of magnitude coarser. Significant burstiness is present at every time resolution, from the finest to the coarsest. We also observed this behavior in other cities, and we will investigate it further in future work using different types of Hurst estimation [10].

6. CONCLUSION AND FUTURE WORK

In this paper we introduced a novel method to collect large-scale vehicular network datasets using always-available online traffic webcams. These webcams are already deployed by governments, companies, or private citizens, and hence offer an inexpensive way to collect data. They provide 24-hour monitoring of the data collection points and have refresh intervals on the order of seconds, which is very desirable for fine-grained data collection. We collected 7.5 TB of vehicular image data from more than 2,700 cameras distributed in 10 cities over 4 continents.
We believe this large amount of data will be very important for mobile network researchers to understand the dynamics of global cities, and a key step toward realistic modeling of vehicular communication networks. Our results strongly suggest revisiting the common assumption of an exponential distribution for modeling vehicular traffic. Finally, the presence of long range dependence has implications for the effect of traffic on the infrastructure of road networks.

Figure 9: Traffic density at different time scales on the Sydney dataset. Panels: (a) 1 min, (b) 10 min, (c) 100 min, (d) 1000 min. x-axis: chronological time; y-axis: density.

Acknowledgment

We are thankful to Georgios Smaragdakis, Harold Chi Liu, Maria Gonzalez Garcia, Ranjan Pal and Shiva Sundaram for their insightful comments.

7. REFERENCES

[1] F. Bai and B. Krishnamachari. Spatio-temporal variations of vehicle traffic in VANETs: facts and implications. In Proceedings of the sixth ACM international workshop on VehiculAr InterNETworking, VANET '09, pages 43–52, New York, NY, USA, 2009. ACM.
[2] F. Bai, N. Sadagopan, and A. Helmy. Important: A framework to systematically analyze the impact of mobility on performance of routing protocols for adhoc networks. In INFOCOM, 2003.
[3] L. Briesemeister, L. Schafers, and G. Hommel. Disseminating messages among highly mobile hosts based on inter-vehicle communication. In Intelligent Vehicles Symposium, 2000. IV 2000. Proceedings of the IEEE, pages 522–527, 2000.
[4] V. Bychkovsky, B. Hull, A. K. Miu, H. Balakrishnan, and S. Madden. A Measurement Study of Vehicular Internet Access Using In Situ Wi-Fi Networks. In 12th ACM MOBICOM Conf., Los Angeles, CA, September 2006.
[5] R. E. Chandler, R. Herman, and E. W. Montroll. Traffic dynamics: Studies in car following. Operations Research, 6(2):165–184, 1958.
[6] A. Elgammal, D. Harwood, and L. Davis. Non-parametric model for background subtraction. In D. Vernon, editor, Computer Vision ECCV 2000, volume 1843 of Lecture Notes in Computer Science, pages 751–767. Springer Berlin / Heidelberg, 2000.
[7] A. Halati, H. Lieu, and S. Walker. CORSIM-corridor traffic simulation model. In Proceedings of the Traffic Congestion and Traffic Safety in the 21st Century Conference, pages 570–576, 1997.
[8] P. Hui, R. Mortier, M. Piórkowski, T. Henderson, and J. Crowcroft. Planet-scale human mobility measurement. In Proceedings of the 2nd ACM International Workshop on Hot Topics in Planet-scale Measurement, HotPlanet '10, pages 1:1–1:5, New York, NY, USA, 2010. ACM.
[9] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar nature of ethernet traffic (extended version). IEEE/ACM Trans. Netw., 2:1–15, February 1994.
[10] R. Lienhart and J. Maydt. An extended set of haar-like features for rapid object detection. In Image Processing. 2002. Proceedings. 2002 International Conference on, volume 1, pages I–900 – I–903 vol.1, 2002.
[11] N. E. Lownes and R. B. Machemehl. Vissim: a multi-parameter sensitivity analysis. In Proceedings of the 38th conference on Winter simulation, WSC '06, pages 1406–1413. Winter Simulation Conference, 2006.
[12] B. Mandelbrot and J. W. Van Ness. Fractional Brownian Motions, Fractional Noises and Applications. SIAM Review, 10(4):422–437, 1968.
[13] Q. Meng and H. L. Khoo. Self-similar characteristics of vehicle arrival pattern on highways. Journal of Transportation Engineering, 135(11):864–872, 2009.
[14] J. Ott and D. Kutscher. Drive-thru internet: IEEE 802.11b for "automobile" users. In INFOCOM 2004. Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies, volume 1, pages 4 vol. (xxxv+2866), March 2004.
[15] M. Piccardi. Background subtraction techniques: a review. In Systems, Man and Cybernetics, 2004 IEEE International Conference on, volume 4, pages 3099–3104 vol.4, Oct. 2004.
[16] J. Singh, N. Bambos, B. Srinivasan, and D. Clawin. Wireless LAN performance under varied stress conditions in vehicular traffic scenarios. In Vehicular Technology Conference, 2002. Proceedings. VTC 2002-Fall. 2002 IEEE 56th, volume 2, pages 743–747 vol.2, 2002.
[17] C. Stauffer and W. Grimson. Adaptive background mixture models for real-time tracking. In Computer Vision and Pattern Recognition, 1999. IEEE Computer Society Conference on, volume 2, pages 2 vol. (xxiii+637+663), 1999.
[18] Z. Sun, G. Bebis, and R. Miller. On-road vehicle detection: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:694–711, 2006.
[19] N. Wisitpongphan, F. Bai, P. Mudalige, V. Sadekar, and O. Tonguz. Routing in sparse vehicular ad hoc wireless networks. Selected Areas in Communications, IEEE Journal on, 25(8):1538–1556, Oct. 2007.
[20] J. Yeo, D. Kotz, and T. Henderson. Crawdad: a community resource for archiving wireless data at Dartmouth. SIGCOMM Comput. Commun. Rev., 36:21–22, April 2006.
[21] X. Zhang, J. Kurose, B. N. Levine, D. Towsley, and H. Zhang. Study of a bus-based disruption-tolerant network: mobility modeling and impact on routing. In Proceedings of the 13th annual ACM international conference on Mobile computing and networking, MobiCom '07, pages 195–206, New York, NY, USA, 2007. ACM.
