Transit Network Design with Two-Level Demand Uncertainties: A Machine Learning and Contextual Stochastic Optimization Framework
Transit Network Design is a well-studied problem in the field of transportation, typically addressed by solving optimization models under fixed demand assumptions. Considering the limitations of these assumptions, this paper proposes a new framework,…
Authors: Hongzhao Guan, Beste Basciftci, Pascal Van Hentenryck
Hongzhao Guan¹ [0000-0002-5006-1165], Beste Basciftci² [0000-0002-3876-2559], and Pascal Van Hentenryck¹ [0000-0001-7085-9994]

¹ H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta GA 30345, USA, {hguan7, pvh}@gatech.edu
² Tippie College of Business, University of Iowa, Iowa City IA 55242, USA, beste-basciftci@uiowa.edu

Abstract. Transit Network Design is a well-studied problem in the field of transportation, typically addressed by solving optimization models under fixed demand assumptions. Considering the limitations of these assumptions, this paper proposes a new framework, namely the Two-Level Rider Choice Transit Network Design (2LRC-TND), that leverages machine learning and contextual stochastic optimization (CSO) through constraint programming (CP) to incorporate two layers of demand uncertainties into the network design process. The first level identifies travelers who rely on public transit (core demand), while the second level captures the conditional adoption behavior of those who do not (latent demand), based on the availability and quality of transit services. To capture these two types of uncertainties, 2LRC-TND relies on two travel mode choice models that use multiple machine learning models. To design a network, 2LRC-TND integrates the resulting choice models into a CSO that is solved using a CP-SAT solver. 2LRC-TND is evaluated through a case study involving over 6,600 travel arcs and more than 38,000 trips in the Atlanta metropolitan area. The computational results demonstrate the effectiveness of 2LRC-TND in designing transit networks that account for demand uncertainties and contextual information, offering a more realistic alternative to fixed-demand models.
Keywords: Constraint Programming · Contextual Stochastic Optimization · Transit Network Design · Travel Behavior

1 Introduction

Transit Network Design is an important problem in transportation and urban planning that focuses on developing effective public transportation services. Transit agencies, which operate transit networks, may need to redesign their systems when new technologies emerge or when the urban environment undergoes changes. A common approach to redesigning a transit network consists in solving an optimization problem that leverages existing transit demand, typically represented by Origin-Destination (O-D) pairs. However, this method often has difficulty accounting for changes in demand, i.e., the number of riders who will actually use the system after the network redesign. For a transit agency, relying solely on the transit demand of the existing system is insufficient because the new system may draw additional riders, potentially leading to overcrowding and poor service quality. On the other hand, overestimating transit demand can result in an inefficient and costly design. Transit design can then be thought of as a game between the transit agency and riders, where the transit design influences ridership, which in turn determines the quality and cost of the system. The critical issue then becomes how to accurately capture travel behaviors from available data and contextual information and integrate them into optimization models.

To address these challenges, this paper provides a novel framework to model the transit network design problem, namely the Two-Level Rider Choice Transit Network Design (2LRC-TND). The 2LRC-TND framework aims at finding a network design as the equilibrium point between the transit agency and its potential riders.
To find this equilibrium point, 2LRC-TND introduces a contextual stochastic optimization (CSO) approach that learns contextual probability distributions to capture rider behavior effectively.

[Figure 1: current transit demand consists of core riders and current transit riders with other choices. Core riders form the core demand for 2LRC-TND; current transit riders with other choices, together with current travelers not using transit, form the latent (or potential) demand for 2LRC-TND.]
Fig. 1: A diagram depicting the demand structure in 2LRC-TND.

More precisely, for representing travel demand, 2LRC-TND considers two levels of uncertainty in capturing core and latent demand, as shown in Figure 1. The first level focuses on identifying "core demand", i.e., travelers who rely on public transit as their primary travel mode and must be served by the transit system; these riders need to be identified from the existing transit demand. The second level captures the "latent demand", i.e., potential riders who may choose to use transit if the system provides good service quality; these riders have alternative travel options and will only adopt transit if the quality meets their needs. Thus, some of these potential riders are already part of the current transit demand, while others are currently rejecting the existing system and using other travel modes. By combining these two levels of uncertainty, the proposed framework introduces a novel approach to represent variable demand.

Compared to prior work in transit network design, the 2LRC-TND framework introduces novel contributions along three axes. First, it incorporates two levels of demand uncertainty; this contrasts with studies such as [13], which divide trip demand into two fixed subsets: the core transit users and the latent demand. Importantly, 2LRC-TND treats the identification of these two sets as a classification problem.
Second, to account for uncertainties in the network design optimization, 2LRC-TND models individual decisions as probability distributions rather than as fixed values. Lastly, 2LRC-TND is a flexible and generic framework that integrates learning and optimization in a way that is applicable to various types of network design problems beyond transit systems.

From a technical standpoint, the 2LRC-TND framework leverages CSO and the Sample Average Approximation (SAA) method to explicitly incorporate behavioral uncertainty into the transit network design optimization model. Under this setting, multiple types of machine learning techniques are used to model riders' choices. The optimization component of 2LRC-TND uses CP-SAT to maximize total system coverage while respecting budget and operational constraints.

To demonstrate the practicality of 2LRC-TND, this paper reports a comprehensive real-world case study using over 38,000 observed trips from the Atlanta metropolitan region. The results show that, within a given budget, 2LRC-TND can fully serve core trip demand while attracting additional riders, yielding a ridership increase with up to 30% higher adoption. This showcases the practicality and potential social impact of 2LRC-TND for transit agencies that aim at implementing data-driven planning solutions.

In summary, the contributions of this paper can be characterized as follows:
1. This paper proposes 2LRC-TND, a novel transit network design framework that captures multiple sources of uncertainty in a hierarchical way.
2. To solve 2LRC-TND applications, the paper introduces a (widely applicable) CSO approach that integrates machine learning and stochastic optimization.
3. Computational results from a large-scale case study with real data show that 2LRC-TND is both practical and beneficial for real-world transit network planning.
2 Related Work

The Transit Network Design problem has been extensively studied and is widely recognized as an essential topic for societal development [16, 11, 12]. A common approach to designing a transit network is to solve an optimization problem, which is also the approach used in this paper. To help readers better understand the context of this work, it is useful to categorize existing studies based on how researchers model transit demand. The majority of studies in this field assume demand is fixed [9, 11, 23, 25, 8], while others consider uncertain or variable demand that responds to the proposed network design [19, 20, 10, 4, 5, 13]. This paper belongs to the latter category and adopts more complex yet more realistic assumptions regarding uncertain demand, thereby filling a critical research gap in this area.

A general approach to modeling transit demand is travel mode choice modeling, which predicts who will use transit and when, and is a key research focus in transportation. These studies primarily rely on survey data collected from individuals. Common modeling techniques range from logit models [21] to traditional machine learning models [28, 30], as well as more advanced Deep Neural Networks (DNN) [22]. This paper develops its choice models by extending the insights and findings from the studies discussed above; those studies focus solely on learning rider behavior and do not consider the subsequent network design problem.

Table 1: Notations

| Notation | Definition |
|---|---|
| Key terms | |
| Rider | A person who takes transit services. |
| Trip \(r\) | Defined by an O-D pair and a group of \(e_r\) riders who travel together and share the same travel decisions. |
| Sets | |
| \(\mathcal{T}\) | The set of all trips (O-D pairs). |
| \(\mathcal{T}^i_{\text{core}}, \mathcal{T}^i_{\text{latent}}\) | The partition of set \(\mathcal{T}\) into \(\mathcal{T}_{\text{core}}\) and \(\mathcal{T}_{\text{latent}}\) under each scenario \(i\). |
| \(\mathcal{N}\) | The set of locations for potential transit stops. |
| \(\mathcal{M}\) | The set of transit modes. Default: \(\{\text{rail}, \text{bus}\}\). |
| \(\mathcal{P}_r\) | The set of considered paths for trip \(r\). |
| \(\mathcal{A}, \mathcal{A}_p\) | The set of all arcs and the set of arcs belonging to path \(p\). |
| \(\mathcal{A}_{\text{fixed}}\) | The set of all fixed arcs. |
| \(\mathcal{A}^+_{n,m}, \mathcal{A}^-_{n,m}\) | The sets of out-arcs and in-arcs for node \(n\) with mode \(m\). |
| \(\mathcal{A}_{n_1,n_2,m}\) | The set of arcs from node \(n_1\) to \(n_2\) with mode \(m\). |
| \(\mathcal{Z}\) | The set containing feasible transit network designs. |
| Decision variables | |
| \(z_a\) | Binary variable indicating if transit arc \(a\) is opened in the network design. |
| \(u^i_r\) | Binary variable indicating if trip \(r\) uses the transit system under scenario \(i\). |
| \(d^i_r\) | Binary variable indicating if trip \(r\) adopts the transit system under scenario \(i\). |
| \(f_p\) | Binary variable indicating if path \(p\) is feasible, determined dynamically based on \(\mathbf{z}\). |
| Parameters | |
| \(I, I'\) | Number of SAA scenarios for solving and evaluation, respectively. |
| \(B\) | Budget of the transit agency for a planning horizon (e.g., a day). |
| \(e_r\) | Number of riders in trip \(r\) who share the same travel behaviors. |
| \(s_a\) | Cost of operating arc \(a\). |
| \(h_a\) | Number of buses per hour on arc \(a\). |
| \(c^i_r\) | Indicator of whether trip \(r\) is a core trip under scenario \(i\). |
| \(d^i_{r,p}\) | Indicator of the adoption decision of trip \(r\) on path \(p\) under scenario \(i\). |
| \(l_r\) | Penalty term, used when evaluating a network design, for not serving trip \(r\). |
| \(w_r\) | Number of paths that are considered for trip \(r\). |
| Vectors | |
| \(\mathbf{z}\) | Vector of \(z_a\) over arcs \(a \in \mathcal{A}\), representing a transit network design. |
| \(\mathbf{x}_r\) | Contextual information vector for trip \(r\). |

The 2LRC-TND problem explored in this paper combines the two previously discussed components: a machine learning problem and an optimization problem. Generally speaking, it is a decision-making problem that incorporates predictive models [6]. A recently emerging framework for addressing such problems is CSO, which considers contextual information and learning approaches in representing uncertainty within optimization problems [24].
The core ideas of these studies have been successfully applied across various domains, such as route planning in transportation [26], order fulfillment [29] and inventory control [7, 27] in supply chain management, shelter location selection in disaster management [17], and public school redistricting in computational social science [15]. The next section formally introduces the 2LRC-TND problem and establishes its connection to the CSO framework.

3 Problem Description and Methodology

This section presents the 2LRC-TND problem, which integrates two levels of demand uncertainties within a stochastic optimization framework. From a technical perspective, 2LRC-TND adopts CSO as a methodological framework and employs Constraint Programming (CP) to model and solve the resulting optimization problems. The CP model aims at maximizing the expected transit coverage by deciding which bus arcs to activate or deactivate, while satisfying several constraints. It is important to note that the primary goal of this study is to demonstrate the practicality of the proposed CSO-based framework. The transit network design problem can be adapted to optimize other objectives, such as cost and quality of service, as long as the formulation remains compatible with the structure of the framework. Table 1 summarizes the notations.

3.1 Demand Uncertainty

To accurately capture demand uncertainties, two key questions must be answered. The first question identifies the riders who need to use public transit on a daily basis, i.e., those who have no alternative travel options. Following this question, two demand sets can be defined: \(\mathcal{T}_{\text{core}}\) and \(\mathcal{T}_{\text{latent}}\). The core set \(\mathcal{T}_{\text{core}}\) captures the riders who must use transit, while the latent demand set \(\mathcal{T}_{\text{latent}}\) contains those who have other options. Trips in \(\mathcal{T}_{\text{latent}}\) will adopt the transit system only when certain criteria are met.
Together, they form the trip set \(\mathcal{T}\). Therefore, the first level of demand uncertainty lies in how \(\mathcal{T}\) is divided into \(\mathcal{T}_{\text{core}}\) and \(\mathcal{T}_{\text{latent}}\). Once the trips in \(\mathcal{T}_{\text{core}}\) are identified, the transit agency must provide them with appropriate service.

The second question considers the latent set \(\mathcal{T}_{\text{latent}}\) and asks when these riders choose to use transit instead of other modes. These riders will likely decide to adopt or reject transit based on the travel path presented to them; this is the second level of demand uncertainty. In contrast to \(\mathcal{T}_{\text{core}}\), the transit agency is not obligated to serve all riders in \(\mathcal{T}_{\text{latent}}\); however, it aims at attracting as many of them as possible by offering them high-quality service.

With the two aforementioned questions in mind, all adoption decisions are defined as random variables drawn from (unknown) distributions, which will be learned from contextual information. The context \(\mathbf{x}_r\) for trip \(r\) typically includes demographic, geographic, and trip-specific information. Consider two binary random variables: (1) \(C_r\), which is 1 if trip \(r\) is a core trip and 0 otherwise, and (2) \(D_r\), which is 1 if latent trip \(r\) adopts the system and 0 otherwise. The distributions of these random variables are learned by machine learning models given the contextual information. Let \(U_r\) be a random variable indicating whether trip \(r\) utilizes the transit system, i.e.,

\[
U_r =
\begin{cases}
1, & \text{if } C_r = 1, \\
1, & \text{if } C_r = 0 \text{ and } D_r = 1, \\
0, & \text{if } C_r = 0 \text{ and } D_r = 0.
\end{cases}
\tag{1}
\]

3.2 Network Design with Contextual Stochastic Optimization

This paper defines the 2LRC-TND problem over a directed multi-graph \(G = (\mathcal{N}, \mathcal{A})\). The set of nodes \(\mathcal{N}\) includes all locations, where each \(n \in \mathcal{N}\) represents a potential stop for transit service. The set of arcs \(\mathcal{A}\) contains all allowed connections between pairs of locations.
Each arc \(a \in \mathcal{A}\) is associated with an origin, a destination, a transit mode \(a_m\) such as bus or rail, and a frequency \(h_a\) (for example, if the interval between two consecutive buses on this arc is 10 minutes, then \(h_a\) is 6 buses per hour). The decision variable \(z_a\) indicates whether arc \(a \in \mathcal{A}\) is opened. The vector \(\mathbf{z}\) collectively represents the network design, and the set \(\mathcal{Z}\) contains all feasible network designs. Further modeling assumptions and details are discussed in subsequent sections.

2LRC-TND aims at maximizing expected transit coverage (or ridership), which can be represented in the following compact form:

\[
\max_{\mathbf{z} \in \mathcal{Z}} \; \mathbb{E}_{U \sim P(U \mid \mathbf{z}, \mathbf{x})} \left[ g(U) \right],
\tag{2}
\]

where the function \(g(U)\) represents the transit coverage under network design \(\mathbf{z}\) and ridership \(U\) as defined for every trip \(r\) in (1), and the conditional distribution \(P(U \mid \mathbf{z}, \mathbf{x})\) corresponds to the adoption behavior of riders based on the context information \(\mathbf{x}\) of all trips and the network design \(\mathbf{z}\). The transit coverage is defined as

\[
g(U) = \sum_{r \in \mathcal{T}} e_r \cdot U_r,
\tag{3}
\]

where \(e_r\) is the number of riders associated with trip \(r\). Assuming that each trip \(r\) makes its adoption decision independently from other trips, the resulting CSO can be expressed as

\[
\max_{\mathbf{z} \in \mathcal{Z}} \; \sum_{r \in \mathcal{T}} e_r \cdot \mathbb{E}_{U_r \sim P(U_r \mid \mathbf{z}, \mathbf{x}_r)} \left[ U_r \right].
\tag{4}
\]

2LRC-TND uses the Sample Average Approximation (SAA) method [18] to approximate the CSO problem (4). This approach generates \(I\) independent and identically distributed scenarios by sampling from the distribution \(P(U_r \mid \mathbf{z}, \mathbf{x}_r)\). Let \(\text{SAA}^i_r(\mathbf{z})\) represent the adoption decision made by trip \(r \in \mathcal{T}\) under the network design \(\mathbf{z}\) in scenario \(i \in [I]\). Then, the 2LRC-TND problem in (4) can be reformulated as follows:

\[
\max_{\mathbf{z} \in \mathcal{Z}} \; \frac{1}{I} \sum_{i=1}^{I} \sum_{r \in \mathcal{T}} e_r \cdot \text{SAA}^i_r(\mathbf{z}).
\tag{5}
\]

3.3 Decisions and Paths

To estimate \(P(U_r \mid \mathbf{z}, \mathbf{x}_r)\), 2LRC-TND uses probabilistic choice models \(C_{\text{core}}(\mathbf{x}_r)\) and \(C_{\text{adopt}}(\mathbf{x}_r, p)\) that capture the distributions of \(C_r\) and \(D_r\), respectively. In general, both \(C_{\text{core}}\) and \(C_{\text{adopt}}\) are machine learning models trained on the contextual information and the transit paths suggested to potential riders. Both models capture the probabilities associated with the positive labels. Based on these probabilities, \(C_{\text{core}}\) and \(C_{\text{adopt}}\) can then be used for classification, where \(C_{\text{core}}\) decides whether a trip \(r\) is a core trip (label 1) or a latent trip (label 0), and \(C_{\text{adopt}}\) classifies whether a latent trip \(r\) would adopt (1) or reject (0) a given transit path \(p\). Following this, \(C_r\) can be modeled as:

\[
C_r =
\begin{cases}
1, & \text{if } C_{\text{core}}(\mathbf{x}_r) = 1, \\
0, & \text{otherwise.}
\end{cases}
\tag{6}
\]

To model \(C_{\text{adopt}}\), define a transit path \(p\) as a route that links the origin and destination of a trip through transit services. Each path \(p\) typically includes two walking trip-legs and one or more transit trip-legs. Consider the network \(\mathbf{z}\) where all the arcs (e.g., bus routes) are opened, and define, for each trip \(r \in \mathcal{T}\), the set \(\mathcal{P}_r\) representing the \(w_r\) paths with the shortest travel times, where \(w_r\) is a small number. For each trip \(r \in \mathcal{T}_{\text{latent}}\), the choice model \(C_{\text{adopt}}(\mathbf{x}_r, p)\) takes two inputs, the contextual information \(\mathbf{x}_r\) and information associated with a transit path \(p\) (e.g., the number of transfers and the transit time); it outputs the adoption decision for trip \(r\). The intuition behind \(C_{\text{adopt}}\) is that individuals decide whether to use transit based on the availability of a favorable path. With these definitions, \(D_r\) is modeled as:

\[
D_r =
\begin{cases}
1, & \text{if } \bigvee_{p \in \mathcal{P}_r} \left( \textit{feasible}(\mathbf{z}, p) \wedge C_{\text{adopt}}(\mathbf{x}_r, p) = 1 \right), \\
0, & \text{otherwise.}
\end{cases}
\tag{7}
\]

Here, \(\textit{feasible}(\mathbf{z}, p)\) holds if path \(p\) is feasible under the network design \(\mathbf{z}\), i.e., if all transit trip-legs in path \(p\) are open.
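As an illustration of how Equations (1), (5), and (7) fit together, the following minimal sketch evaluates the SAA coverage of a given design on already-sampled scenarios. The data layout (dictionaries keyed by hypothetical trip, path, and arc identifiers) and all function names are assumptions made for this example, not the paper's implementation. Core trips are counted as served, as in Equation (1); the requirement that they actually have a feasible path is a separate constraint of the optimization model and is not checked here.

```python
from typing import Dict, List, Set


def feasible(open_arcs: Set[str], path_arcs: List[str]) -> bool:
    """feasible(z, p): every transit arc on the path is open in the design."""
    return all(arc in open_arcs for arc in path_arcs)


def uses_transit(is_core: bool,
                 adopts: Dict[str, bool],
                 paths: Dict[str, List[str]],
                 open_arcs: Set[str]) -> int:
    """U_r from Eq. (1), with D_r computed as in Eq. (7)."""
    if is_core:
        return 1
    # Latent trip: needs at least one path that is both feasible and adopted.
    return int(any(feasible(open_arcs, arcs) and adopts[p]
                   for p, arcs in paths.items()))


def saa_coverage(trips: Dict[str, dict],
                 scenarios: List[dict],
                 open_arcs: Set[str]) -> float:
    """Objective of Eq. (5): average rider coverage over sampled scenarios."""
    total = 0
    for sc in scenarios:
        for r, trip in trips.items():
            total += trip["riders"] * uses_transit(
                sc["core"][r], sc["adopt"][r], trip["paths"], open_arcs)
    return total / len(scenarios)


# Hypothetical toy instance: two trips, three arcs, two sampled scenarios.
trips = {
    "r1": {"riders": 2, "paths": {"p1": ["a", "b"], "p2": ["c"]}},
    "r2": {"riders": 1, "paths": {"p3": ["c"]}},
}
scenarios = [
    {"core": {"r1": False, "r2": True},
     "adopt": {"r1": {"p1": True, "p2": False}, "r2": {"p3": False}}},
    {"core": {"r1": False, "r2": False},
     "adopt": {"r1": {"p1": False, "p2": True}, "r2": {"p3": True}}},
]

print(saa_coverage(trips, scenarios, {"a", "b"}))       # 1.5
print(saa_coverage(trips, scenarios, {"a", "b", "c"}))  # 3.0
```

With only arcs a and b open, the first scenario covers all three riders and the second covers none, giving an average of 1.5; opening arc c as well raises the average to 3.0.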
Therefore, following Equation (7), trip \(r\) selects transit under design \(\mathbf{z}\) if there exists a feasible path \(p\) that \(r\) adopts. Note that, for latent trips, riders are assumed to consider only the paths in \(\mathcal{P}_r\), as these represent the fastest options. Any path with a longer travel time is treated as a rejection.

Both \(C_{\text{core}}\) and \(C_{\text{adopt}}\) are treated as black boxes within the 2LRC-TND framework and can range from complex models, such as deep neural networks, to simple rule-based approaches. Importantly, under the black-box setting, even without high-quality or sufficient data to construct detailed contextual features \(\mathbf{x}_r\), one can still develop reasonable rule-based models based on domain expertise and experience to make the 2LRC-TND framework function effectively.

The overall 2LRC-TND framework can thus be summarized as follows. Given the set of trips \(\mathcal{T}\), for each scenario, the first step identifies the core riders. Each core trip \(r\) must be served, which means at least one path in \(\mathcal{P}_r\) is available under the network design. The remaining trips are classified as latent in that scenario. For each latent trip, 2LRC-TND checks whether any path in \(\mathcal{P}_r\) is available and adopted. If not, trip \(r\) continues using other travel modes.

3.4 The Constraint Programming Model

This study adopts CP as the modeling approach due to its strengths in handling complicated logical relationships. The proposed model includes many AND and OR operations, which would require extensive linearization if implemented using Mixed-Integer Programming (MIP), significantly increasing the complexity of the model. Building on the concepts introduced earlier, Figure 2 presents the CP model. The objective function (8a) averages over the \(I\) SAA scenarios to compute the expected objective, where \(u^i_r\) denotes the final adoption decision of trip \(r\) under scenario \(i\). The first group of constraints links \(z_a\) with the adoption decisions.
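Constraints (8b) and (8c) implement Equations (1) and (7), where \(c^i_r\) and \(d^i_{r,p}\) are sampled using the choice functions \(C_{\text{core}}(\mathbf{x}_r)\) and \(C_{\text{adopt}}(\mathbf{x}_r, p)\), respectively. The definitions of \(\mathcal{T}^i_{\text{latent}}\) and \(\mathcal{T}^i_{\text{core}}\) are based on \(c^i_r\): \(\mathcal{T}^i_{\text{latent}} = \{ r \in \mathcal{T} : c^i_r = 0 \}\) and \(\mathcal{T}^i_{\text{core}} = \{ r \in \mathcal{T} : c^i_r = 1 \}\).

\[
\begin{array}{llll}
\max & \dfrac{1}{I} \sum_{i=1}^{I} \sum_{r \in \mathcal{T}} e_r \cdot u^i_r & & \text{(8a)} \\[2ex]
\text{s.t.} & u^i_r = \begin{cases} 1 & \forall r \in \mathcal{T}^i_{\text{core}},\ i \in [I] \\ d^i_r & \forall r \in \mathcal{T}^i_{\text{latent}},\ i \in [I] \end{cases} & & \text{(8b)} \\[2ex]
& d^i_r = \bigvee_{p \in \mathcal{P}_r} \left( f_p \wedge d^i_{r,p} \right) & \forall r \in \mathcal{T}^i_{\text{latent}},\ i \in [I] & \text{(8c)} \\
& z_a \geq f_p & \forall a \in \mathcal{A}_p,\ p \in \mathcal{P}_r,\ r \in \mathcal{T} & \text{(8d)} \\
& f_p \geq \sum_{a \in \mathcal{A}_p} z_a - |\mathcal{A}_p| + 1 & \forall p \in \mathcal{P}_r,\ r \in \mathcal{T} & \text{(8e)} \\
& c^i_r \leq \sum_{p \in \mathcal{P}_r} f_p & \forall r \in \mathcal{T}^i_{\text{core}},\ i \in [I] & \text{(8f)} \\
& B \geq \sum_{a \in \mathcal{A}} s_a \cdot z_a & & \text{(8g)} \\
& z_a = 1 & \forall a \in \mathcal{A}_{\text{fixed}} & \text{(8h)} \\
& 0 = \sum_{a \in \mathcal{A}^+_{n,m}} h_a \cdot z_a - \sum_{a \in \mathcal{A}^-_{n,m}} h_a \cdot z_a & \forall n \in \mathcal{N},\ m \in \mathcal{M} & \text{(8i)} \\
& 1 \geq \sum_{a \in \mathcal{A}_{n_1,n_2,m}} z_a & \forall n_1, n_2 \in \mathcal{N},\ m \in \mathcal{M} & \text{(8j)} \\
& z_a \in \{0, 1\} & \forall a \in \mathcal{A} & \text{(8k)} \\
& u^i_r \in \{0, 1\} & \forall r \in \mathcal{T},\ i \in [I] & \text{(8l)} \\
& d^i_r \in \{0, 1\} & \forall r \in \mathcal{T}^i_{\text{latent}},\ i \in [I] & \text{(8m)} \\
& f_p \in \{0, 1\} & \forall p \in \mathcal{P}_r,\ r \in \mathcal{T} & \text{(8n)}
\end{array}
\]

Fig. 2: The Constraint Programming Model.

Constraints (8d) and (8e) together define \(f_p\) dynamically based on \(\mathbf{z}\), which is a linearized version of \(\textit{feasible}(\mathbf{z}, p)\). Constraint (8d) ensures that path \(p\) is infeasible if any of its transit arcs is closed. Constraint (8e) ensures that path \(p\) is set as feasible when all of its transit arcs are open. Using \(f_p\), Constraint (8f) ensures that at least one path is feasible for each core trip, thereby guaranteeing that the core riders have access to the service. At this stage, solving the model does not explicitly determine which path each trip will use; however, the solution reveals the set of feasible path options available to each trip and identifies which latent trips adopt the new system.

Constraints (8g)–(8j) focus on the modeling of the transit network itself and incorporate several realistic assumptions commonly found in transit network design.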
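As a brief aside on the path-linking Constraints (8d) and (8e), the following small check enumerates every open/closed pattern for a three-arc path and confirms that the two inequalities leave exactly one admissible value of the path flag, namely the logical AND of the arc variables. It is a plain-Python sanity check written for this explanation, with hypothetical function names, and is not part of the CP model itself.

```python
from itertools import product
from typing import List, Sequence


def satisfies_8d_8e(z: Sequence[int], f: int) -> bool:
    """Check (8d) z_a >= f_p for every arc and (8e) f_p >= sum(z_a) - |A_p| + 1."""
    upper_ok = all(z_a >= f for z_a in z)   # (8d): a closed arc forces f_p = 0
    lower_ok = f >= sum(z) - len(z) + 1     # (8e): all arcs open forces f_p = 1
    return upper_ok and lower_ok


def admissible_flags(z: Sequence[int]) -> List[int]:
    """All binary values of f_p that satisfy both constraints for arc states z."""
    return [f for f in (0, 1) if satisfies_8d_8e(z, f)]


# Exhaustive check on a path with three transit arcs.
for z in product((0, 1), repeat=3):
    assert admissible_flags(z) == [int(all(z))]

print(admissible_flags((1, 1, 1)))  # [1]
print(admissible_flags((1, 0, 1)))  # [0]
```

Because the flag is uniquely determined by the arc variables, the solver never has freedom to declare a path feasible while one of its arcs is closed, or infeasible while all of them are open.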
It is important to note that Constraints (8g)–(8j) are independent of the SAA scenarios. First, Constraint (8g) ensures that the network design remains within the operational budget \(B\), where \(s_a\) denotes the cost of operating arc \(a\) over the planning horizon (e.g., a typical weekday). Second, Constraint (8h) enforces that all arcs in the set \(\mathcal{A}_{\text{fixed}}\) remain open. The set \(\mathcal{A}_{\text{fixed}}\) typically represents existing infrastructure that the transit agency intends to preserve in the new system, such as underground subway lines. Next, Constraint (8i) ensures flow balance for each node in \(\mathcal{N}\) and each mode in \(\mathcal{M}\), where \(\mathcal{A}^+_{n,m}\) and \(\mathcal{A}^-_{n,m}\) include the arcs that go out of and into node \(n\) with mode \(m\), respectively. In this paper, \(\mathcal{M}\) is \(\{\text{bus}, \text{rail}\}\) by default; however, it can easily be customized. Lastly, Constraint (8j) ensures that any two locations \(n_1\) and \(n_2\) are connected by at most one arc per mode, where \(\mathcal{A}_{n_1,n_2,m}\) includes all arcs from \(n_1\) to \(n_2\) with mode \(m\). Note that the output consists of a set of connected arcs between nodes, whereas in real-world settings, travelers use complete bus lines rather than navigating arc by arc. Post-processing methods can be employed to convert the selected arcs into operational bus lines.

4 The Atlanta Case Study

This section presents a case study based in Atlanta, Georgia, USA. In summary, by solving a large-scale 2LRC-TND problem, this case study redesigns a transit system with 38,179 trips (\(\mathcal{T}\)) in the city of Atlanta, where services are currently operated by the Metropolitan Atlanta Rapid Transit Authority (MARTA). To ease the construction of meaningful datasets, the entire case study is conducted within a rectangular area (see Figure 4a) that closely aligns with the I-285 interstate and MARTA's service area. Figure 3 illustrates how the case study fits within the 2LRC-TND framework.
This case study utilizes three datasets: Survey-ARC for training \(C_{\text{core}}\), L-ARC-9 for training \(C_{\text{adopt}}\), and L-ARC-8 for evaluating the complete 2LRC-TND framework.

[Figure 3: flowchart. ML part: Survey-ARC trains \(C_{\text{core}}\) (e.g., a Random Forest) and L-ARC-9 trains \(C_{\text{adopt}}\) (e.g., a Neural Network), both with contextual information. SAA under scenario \(i\): \(C_{\text{core}}\) splits the 6,659 transit trips in L-ARC-8 into core trips \(\mathcal{T}^i_{\text{core}}\) and transit trips with other choices; the latter, together with the 31,520 driving trips in L-ARC-8, form the latent trips \(\mathcal{T}^i_{\text{latent}}\), to which \(C_{\text{adopt}}\) is applied to obtain the adoptions \(d^i_{r,p}\); paths are collected for both sets. CP part: the CP model is solved with \(I\) SAA scenarios to produce a network design, which is then evaluated with \(I'\) additional SAA scenarios.]
Fig. 3: A flowchart showing the Atlanta case study within the 2LRC-TND framework. Blue indicates that these components change in each SAA scenario.

4.1 Construct \(C_{\text{core}}\) and \(C_{\text{adopt}}\)

The model \(C_{\text{core}}\) predicts the probability of a trip \(r\) being a core trip. It is trained on the Survey-ARC dataset, which is constructed from a 2019 transit rider survey conducted by the Atlanta Regional Commission (ARC) [1]. The ground-truth labels contain 13,607 core trips (class 1) and 5,233 transit trips with other choices (class 0), where each trip represents a single rider. Since no prior study has directly addressed core trip classification, this paper adapts this survey by defining core trips as those made by commuters (trips between home and work) who cannot travel without transit, use transit at least twice weekly, and walk to/from their stops. All other commuters in the dataset are assigned to class 0.
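The labeling rule just described can be sketched as a small function. The record type and its field names below are hypothetical stand-ins for survey attributes (the actual survey schema is not given in the text), and "walk to/from their stops" is interpreted here as walking on both ends of the trip, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SurveyTrip:
    """Hypothetical subset of survey fields needed for the core-trip label."""
    is_commute: bool                   # trip between home and work
    can_travel_without_transit: bool   # respondent has another way to travel
    transit_days_per_week: int         # how often the respondent uses transit
    walks_to_stop: bool                # access leg is on foot
    walks_from_stop: bool              # egress leg is on foot


def core_label(trip: SurveyTrip) -> int:
    """Return 1 for a core trip and 0 for a transit trip with other choices."""
    return int(
        trip.is_commute
        and not trip.can_travel_without_transit
        and trip.transit_days_per_week >= 2
        and trip.walks_to_stop
        and trip.walks_from_stop
    )


dependent = SurveyTrip(True, False, 5, True, True)
has_car = SurveyTrip(True, True, 5, True, True)
print(core_label(dependent), core_label(has_car))  # 1 0
```

Labels produced this way would serve as the ground truth for training the core-trip classifier; any single unmet condition places the commuter in class 0.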
The contextual features include demographic characteristics (e.g., age, income level), geographic data (e.g., number of bus stops within a 5-minute walking distance from the trip origin), and travel time information (e.g., travel time from origin to destination if driving a car).

A separate dataset, referred to as L-ARC-9, is used to train the \(C_{\text{adopt}}\) model. The L-ARC-9 dataset is derived from ARC's 2020 activity-based simulation for the Atlanta metropolitan area [2]; detailed data processing procedures are documented in the Appendix of [14]. The original simulation study covers 5 million people and 13 million tours in a regular workday, from which L-ARC-9 selects 15,109 home-to-work trips made between 9 and 10 AM that overlap the rectangular area in Figure 4a. This highly imbalanced dataset contains 1,019 adopting trips (class 1) and 14,090 rejecting trips (class 0). Adopting trips are made by commuters who use transit despite having spare vehicles at home, while rejecting trips are made by those who drive alone despite transit availability in their area. Each trip represents a single rider and includes demographic, geographic, and trip-related contextual features. Since no existing datasets or surveys directly capture adoption behavior, this dataset provides a practical alternative to resource-intensive data collection.

In summary, the Survey-ARC and L-ARC-9 datasets are used solely for training the choice models. Although the datasets originate from different sources, both were collected by the same agency within the same geographic region. Additionally, to ensure the trips under study follow relatively fixed patterns, the analysis focuses on commuting trips, as they best represent consistent travel behaviors on a regular day.

4.2 Demand Set \(\mathcal{T}\), Network Arcs, and Paths

A third dataset, L-ARC-8, is introduced here to represent the demand used for evaluating the 2LRC-TND framework.
A separate evaluation dataset is necessary to assess the framework's performance on unseen data and avoid overfitting to the training sets used for the choice models. This dataset is derived from the same ARC simulation source as L-ARC-9 but focuses on all commuters traveling in the rectangular area during 8-9 AM, the peak morning rush hour. There are 38,179 trips (\(\mathcal{T}\)) in L-ARC-8, with each trip having a single rider (\(e_r = 1\)). Among these trips, 31,520 commute by driving alone, while 6,659 are current transit trips. All trips have contextual information compatible with the trained \(C_{\text{core}}\) and \(C_{\text{adopt}}\) models. Following the steps illustrated in Figure 3, for each SAA scenario, the 6,659 existing transit trips are first split into core trips and current adopting trips using \(C_{\text{core}}\). The current adopting trips and the 31,520 driving trips then form the latent demand for 2LRC-TND.

This case study assumes removal of all existing bus lines while preserving the rail lines (see Figure 4a), as rail infrastructure is inflexible and difficult to modify. Within the defined rectangular area, there are 933 stops (\(\mathcal{N}\)), including 38 rail stations. To construct the set \(\mathcal{A}\) and keep the CP model computationally manageable, only the five nearest stops based on driving distance are considered as potential bus connections for each stop. For each potential connection between two stops, a fixed frequency of six buses per hour is assumed (\(h_a = 6\) for all arcs). In addition to the standard on-road travel time, an extra five minutes is added to account for peak-hour congestion, waiting time, and other potential delays. In total, the network includes 6,004 arcs (\(\mathcal{A}\)), of which 692 are fixed rail arcs (\(\mathcal{A}_{\text{fixed}}\)). To model the operating cost between 8 a.m. and 9 a.m., the case study assumes a rate of $72.15 per hour for buses while in motion.
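The candidate bus-arc construction described above (five nearest stops by driving distance, six buses per hour, five extra minutes of travel time) might be sketched as follows. The nested-dictionary inputs, the output record layout, and the function name are illustrative assumptions; the sketch deliberately omits arc costs, since the text does not spell out how the hourly rate is converted into a per-arc cost.

```python
import heapq
from typing import Dict, List


def candidate_bus_arcs(drive_dist: Dict[str, Dict[str, float]],
                       drive_time_min: Dict[str, Dict[str, float]],
                       k: int = 5,
                       buses_per_hour: int = 6,
                       extra_min: float = 5.0) -> List[dict]:
    """Connect every stop to its k nearest stops by driving distance."""
    arcs = []
    for origin, dists in drive_dist.items():
        # k closest other stops, sorted by increasing driving distance.
        nearest = heapq.nsmallest(
            k, (stop for stop in dists if stop != origin), key=dists.get)
        for dest in nearest:
            arcs.append({
                "from": origin,
                "to": dest,
                "mode": "bus",
                "h": buses_per_hour,
                # On-road time plus the fixed allowance for delays and waiting.
                "time_min": drive_time_min[origin][dest] + extra_min,
            })
    return arcs


# Hypothetical four-stop example with k = 2 instead of 5.
drive_dist = {
    "A": {"B": 1.0, "C": 2.0, "D": 3.0},
    "B": {"A": 1.0, "C": 1.5, "D": 2.5},
    "C": {"A": 2.0, "B": 1.5, "D": 0.5},
    "D": {"A": 3.0, "B": 2.5, "C": 0.5},
}
drive_time = {o: {d: 3.0 * km for d, km in row.items()}
              for o, row in drive_dist.items()}

arcs = candidate_bus_arcs(drive_dist, drive_time, k=2)
print(len(arcs), arcs[0])
```

Limiting each stop to a handful of neighbors keeps the number of binary arc variables linear in the number of stops rather than quadratic, which is the stated reason for the restriction.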
Rail operating costs are excluded from the budget \(B\), as all rail arcs are fixed and not subject to redesign. The cost estimation and stop information can be found in an Atlanta case study [3].

To enumerate the paths, a graph is first constructed with all arcs in \(\mathcal{A}\) open. For each trip \(r\), the origin and destination are connected to their five nearest stops based on walking distance, and the four (\(w_r\)) fastest paths are then selected using shortest-path algorithms to construct the set \(\mathcal{P}_r\). The value \(w_r = 4\) is chosen to maintain a reasonable model size; additionally, real-world navigation apps commonly consider a similar number of transit paths for users. When constructing SAA scenario \(i\), an adoption decision (\(d^i_{r,p}\)) for each path in \(\mathcal{P}_r\) is sampled based on the predicted adoption probability from the previously trained model \(C_{\text{adopt}}\). Similarly, \(c^i_r\) is sampled using the predicted probability from \(C_{\text{core}}\). After sampling, latent trips (\(c^i_r = 0\)) for which all paths in \(\mathcal{P}_r\) are rejected can be excluded from the optimization model, since these trips will reject the network regardless of the design.

4.3 Experiments: Choice Modeling and Evaluating 2LRC-TND

The first group of experiments in this case study focuses on evaluating the performance of \(C_{\text{core}}\) and \(C_{\text{adopt}}\) when treated as classifiers. Specifically, \(C_{\text{core}}\) is evaluated on the Survey-ARC dataset, while \(C_{\text{adopt}}\) is evaluated on L-ARC-9. Three types of machine learning models are used: Multinomial Logit (L), Random Forest (RF), and Deep Neural Network (DNN). Each dataset is randomly split using an 80-20 training-test split, and this process is repeated 100 times. For each training-test split, hyper-parameters are tuned via a grid search with 5-fold cross-validation on the training set, and the best configuration is evaluated on the test set.
Specifically, for the Logit model, the regularization parameter C is tested with values {0.1, 0.8, 1.0, 10} and the maximum number of iterations with {100, 500, 2000}; class weight ratios of {1:2, 1:3, 1:5, 1:10} are additionally evaluated for $C_{\text{adopt}}$ to address class imbalance. The Random Forest models test the number of estimators with values {500, 1000, 2000} and the maximum tree depth with {10, 15, 20, 25}, with the class weight ratios of {1:2, 1:3, 1:5, 1:10} also explored for $C_{\text{adopt}}$. The DNN architecture employs embeddings for categorical and ordinal variables, followed by two fully connected layers with 32 and 16 neurons respectively, each using a dropout rate of 0.2; training uses the Adam optimizer. Learning rates are tuned with candidate values {0.001, 0.01, 0.1, 0.15}, and early stopping is applied with a patience of 20 epochs and a tolerance of 0.001.

The second group of experiments evaluates the 2LRC-TND framework by comparing it against three benchmarks. The first benchmark, Fixed-Demand (FD), designs a network deterministically using only the 6,659 existing transit riders, without considering demand uncertainties. The second benchmark, Naive Rule-Based (RB), involves no machine learning and uses hand-crafted rules to perform the same functions as $C_{\text{core}}$ and $C_{\text{adopt}}$. An example of a simple rule is: "If both the origin and destination of a trip are within a 5-minute walk of a transit station, assign a core trip probability of 0.8; otherwise, assign 0.2." In general, a rule-based approach requires extensive domain expertise, and the design of such rules is beyond the scope of this paper. The third benchmark, Deterministic, employs the same machine learning models as 2LRC-TND but applies a fixed threshold of 0.5 to their predictions to obtain deterministic decisions, and designs the network under this setting. For each of these runs (2LRC-TND and the benchmarks), two budget levels are tested.
These values are chosen based on the total cost of activating all available arcs from 8 AM to 9 AM, which is approximately \$58K. A budget of \$30K represents roughly half of this total, while \$35K corresponds to about 60%, allowing for a meaningful comparison under different budget constraints.

All implementations for this paper are programmed in Python 3.11, with the support of the machine learning packages PyTorch, Scikit-learn, and imblearn. The CP-SAT solver from OR-Tools is used for solving the optimization model, configured to run with 8 threads and a solving time limit of 12 hours. All experiments were conducted using the CPUs and RTX-6000 GPUs available on a Linux High Performance Computing cluster.

4.4 Evaluation Metrics for Network Designs

Network designs can be evaluated by standard transportation metrics, such as the number of people covered and the best travel time offered to riders. This study presents an additional evaluation approach based on SAA, where a network design is assessed using $I'$ out-of-sample scenarios ($I' \gg I$), different from the $I$ scenarios used during network design. Since a network is optimized only with respect to a specific set of scenarios, it may not be feasible for scenarios outside that set, i.e., Constraint (8f) might not be satisfied. The function $\mathrm{Eval}$ evaluates a network design $z$ by quantifying the number of riders it successfully serves and by penalizing each core trip $r$ that has no feasible path with cost $l_r$. The resulting function can be represented as follows by considering the optimization model in Figure 2 under the given network design $z$ over the $I'$ out-of-sample scenarios:

$$\mathrm{Eval}(z) = \frac{1}{I'} \sum_{i=1}^{I'} \sum_{r \in \mathcal{T}} e_r \cdot \Bigl( u^i_r - l_r \cdot \max\Bigl( c^i_r - \sum_{p \in \mathcal{P}_r} f_p,\; 0 \Bigr) \Bigr) \tag{9}$$

Table 2: Choice modeling results averaged over 100 runs (100 different training-test splits), with standard deviations in parentheses.

| Model | $C_{\text{core}}$ (Survey-Arc): F1-score | $C_{\text{core}}$ (Survey-Arc): Accuracy | $C_{\text{core}}$ (Survey-Arc): AUC-ROC | $C_{\text{adopt}}$ (L-Arc-9): Weighted F1 | $C_{\text{adopt}}$ (L-Arc-9): Accuracy | $C_{\text{adopt}}$ (L-Arc-9): AUC-ROC |
|---|---|---|---|---|---|---|
| L | 0.844 (0.005) | 0.780 (0.006) | 0.805 (0.008) | 0.912 (0.003) | 0.924 (0.004) | 0.820 (0.015) |
| DNN | 0.866 (0.015) | 0.803 (0.016) | 0.799 (0.011) | 0.907 (0.022) | 0.897 (0.036) | 0.807 (0.029) |
| RF | 0.885 (0.003) | 0.821 (0.005) | 0.819 (0.008) | 0.923 (0.003) | 0.933 (0.003) | 0.855 (0.015) |

5 Computational Results

This section begins by presenting the machine learning results from the two choice models. It then reports the optimization results of the 2LRC-TND framework applied to the aforementioned case study.

5.1 Results on Choice Modeling

Table 2 summarizes the performance of the choice models on the classification tasks. While the logit model serves as a baseline due to its simplicity, it still produces reasonably consistent results. RF, a tree-based model, outperforms the other two models on every reported metric. In contrast, DNNs trail RF on both tasks and fall below the logit baseline on $C_{\text{adopt}}$. This under-performance is likely attributable to the tabular nature of the data, a setting in which DNNs often struggle relative to tree-based models. For RF's hyper-parameters, the best configuration for $C_{\text{core}}$ is 2000 estimators with a maximum depth of 25. For $C_{\text{adopt}}$, RF uses 1000 estimators, a maximum depth of 10, and a class weighting ratio of 1:5 between class 0 and class 1 during training. Overall, the results are consistent and exhibit low standard deviation across the 100 runs. For $C_{\text{adopt}}$, the weighted F1-score is employed to address the highly imbalanced nature of the dataset. Given this imbalance, models achieving an AUC-ROC score above 0.8 can be considered strong. Beyond classification performance, the models also yield valuable insights that usually surpass the capabilities of basic rule-based methods, and they are particularly useful when integrated into larger optimization frameworks.
These results further emphasize the need for more comprehensive and balanced data collection to better capture the complexities inherent in both tasks.

5.2 Results on 2LRC-TND

Table 3 summarizes the computational results on optimization. Random Forest is employed as $C_{\text{core}}$ and $C_{\text{adopt}}$ in 2LRC-TND because of its simplicity of implementation (compared to DNNs) and its superior prediction performance demonstrated in the prior experiments. Selected visualizations of the designs (limited by page constraints) are shown in Figure 4. Examining the network designs, it appears that all runs open a similar number of arcs within the same budget. This outcome occurs because the model tries to utilize the entire available budget. The resulting networks largely preserve existing MARTA corridors while reallocating capacity toward underserved areas with high latent demand.

Table 3: 2LRC-TND results. For evaluations, the reported metrics are averaged across scenarios: 50 ($I$) of them are used in solving and 1,000 ($I'$) only for additional evaluation. For the Adopted% metric, the denominator is $\lvert\mathcal{T}_{\text{latent}}\rvert$. All columns except the last report averaged results over the 50 ($I$) SAA scenarios; the last column reports the evaluation with $I'$.

| Budget (\$) | Run | Run Time (hours) | # Opened Bus Arcs | # Core $\lvert\mathcal{T}_{\text{core}}\rvert$ | # Latent $\lvert\mathcal{T}_{\text{latent}}\rvert$ | # (%) Adopted | Coverage | Coverage Bound | Travel Time (min) | $\mathrm{Eval}(z)$ (# Violated) |
|---|---|---|---|---|---|---|---|---|---|---|
| 30K | FD | 0.04 | 2824 | - | - | - | 13086 | - | 57.13 | - |
| 30K | Naive RB | 2.23 | 2844 | 2139 | 36040 | 5676 (15.75%) | 7814 | 8819 | 56.80 | 7809.15 (0) |
| 30K | Deterministic | 0.03 | 2701 | 4776 | 33403 | 966 (2.89%) | 5742 | 5742 | 53.15 | 13338.58 (15) |
| 30K | 2LRC-TND | 12.00 | 2871 | 4259 | 33920 | 11132 (32.82%) | 15391 | 16655 | 56.44 | 15390.11 (0) |
| 35K | FD | 0.05 | 3148 | - | - | - | 13718 | - | 57.10 | - |
| 35K | Naive RB | 1.98 | 3319 | 2139 | 36040 | 6182 (17.15%) | 8321 | 8819 | 57.03 | 8313.00 (0) |
| 35K | Deterministic | 0.03 | 2838 | 4776 | 33403 | 966 (2.89%) | 5742 | 5742 | 53.12 | 13474.61 (15) |
| 35K | 2LRC-TND | 3.88 | 3313 | 4259 | 33920 | 11745 (34.63%) | 16005 | 16655 | 56.56 | 15999.53 (0) |

The first benchmark, FD, assumes no demand uncertainties and solves the CP model using only the existing 6,659 transit trips. However, after the network is designed, $C_{\text{core}}$ and $C_{\text{adopt}}$ (both using the random forest models in 2LRC-TND) are applied to evaluate the actual number of users who would adopt the service, with results reported in the "Coverage" column of the table. Without considering demand uncertainties (run FD), the network design still attracts new riders to the system, while some riders among the 6,659 reject it. The FD design can still attract some riders because the objective does not aim at budget savings, resulting in a quite extensive network. In contrast, 2LRC-TND accounts for demand uncertainties and achieves roughly 17% higher coverage (15,391 vs. 13,086 riders at $B$ = 30K, and 16,005 vs. 13,718 at $B$ = 35K). These results demonstrate that through the redesign of 2LRC-TND, the agency can attract additional riders by offering services to those who previously lacked transit access. The "Coverage Bound" column indicates the maximum ridership that an agency can cover (assuming no budget limit). This value is determined prior to solving the CP model, during pre-processing.
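One way to read the coverage bound is sketched below in Python. This is a minimal illustration under stated assumptions, not the authors' pre-processing code: it assumes that with no budget limit every arc is open, so every path in $\mathcal{P}_r$ is available, and a trip counts as covered in a scenario if it is a core trip or if at least one of its paths was sampled as adopted. The function name `coverage_bound` and the array layout are hypothetical. The reading is at least consistent with the Deterministic row of Table 3, where 4,776 core trips plus 966 adopted latent trips equal the reported bound of 5,742.

```python
import numpy as np

def coverage_bound(core, adopt, riders):
    """Average number of riders that a fully open network could cover.

    core   : (I, R) 0/1 array, core[i, r] plays the role of c_r^i
    adopt  : (I, R, W) 0/1 array, adopt[i, r, p] plays the role of d_{r,p}^i
    riders : (R,) array, riders[r] plays the role of e_r
    All names and shapes are illustrative assumptions.
    """
    # A trip is servable if it is core, or if any candidate path is adopted.
    servable = np.maximum(core, adopt.any(axis=2))
    # Sum riders per scenario, then average over the I scenarios.
    return float((servable * riders).sum(axis=1).mean())

# Toy data: 2 scenarios, 3 trips, 2 candidate paths per trip.
core = np.array([[1, 0, 0],
                 [0, 0, 1]])
adopt = np.array([[[0, 0], [1, 0], [0, 0]],
                  [[0, 1], [0, 0], [0, 0]]])
riders = np.ones(3)

bound = coverage_bound(core, adopt, riders)
print(bound)
```

Because the computation only needs the sampled scenarios and the enumerated paths, it can be carried out before any solver call, which matches the statement that the bound is obtained during pre-processing.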
The second benchmark, Naive Rule-Based (RB), uses nested if-else conditions based on travel time and origin-destination locations to generate probabilities, serving the same purpose as $C_{\text{core}}$ and $C_{\text{adopt}}$ in 2LRC-TND but without machine learning. The first observation is that, with the rule-based models, the CP model can be solved to optimality within about two hours, whereas 2LRC-TND ($B$ = 30K) only reports the best feasible solution after the run time limit of 12 hours is reached. This is likely due to the machine learning model predicting a greater number of core riders, which requires the CP model to provide a path for each core rider, thereby increasing computational difficulty. Secondly, although the core riders differ across runs, 2LRC-TND achieves a notably higher number of adopted riders at both budget levels (11,132 and 11,745, versus 5,676 and 6,182 for Naive RB). Specifically, RF yields a higher adoption rate by learning directly from data, while Naive RB depends on manually defined rules that lack consideration of geographical and demographic information.

Fig. 4: Selected maps related to the case study (designs are hypothetical). Panels: (a) Rail System & Rectangular Area; (b) FD ($B$ = 30K); (c) 2LRC-TND ($B$ = 30K); (d) 2LRC-TND ($B$ = 35K).

To demonstrate the importance of stochastic modeling, a third benchmark employing deterministic choice models is evaluated against 2LRC-TND. This configuration utilizes the same machine learning models as 2LRC-TND; however, they generate deterministic binary decisions using a typical fixed threshold of 0.5, thus eliminating the need for the SAA scenarios in $I$. In this case study, since the predicted probability of adopting a path for latent trips is typically below 0.5, this deterministic approach underestimates potential ridership adoption. As shown in Table 3, the deterministic approach achieves a significantly weaker adoption rate across both budget levels (2.89%, compared with 32.82% and 34.63% for 2LRC-TND).
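The gap between thresholding and sampling can be illustrated with a small Python sketch. It uses synthetic probabilities rather than the paper's model outputs, and it assumes every candidate path is available, so the numbers do not reproduce Table 3. The function names and the uniform range of probabilities are hypothetical choices made only for illustration.

```python
import numpy as np

def adoption_rate_threshold(path_probs, threshold=0.5):
    """Share of trips with at least one path whose probability meets the threshold."""
    decisions = path_probs >= threshold
    return float(decisions.any(axis=1).mean())

def adoption_rate_sampled(path_probs, n_scenarios, rng):
    """Share of trips with at least one adopted path, averaged over sampled scenarios.

    Each path decision is an independent Bernoulli draw with the predicted
    probability, mirroring how SAA scenarios are described in the text.
    """
    draws = rng.random((n_scenarios,) + path_probs.shape) < path_probs
    return float(draws.any(axis=2).mean())

rng = np.random.default_rng(0)
# Synthetic stand-in for predicted adoption probabilities:
# 1,000 latent trips, 4 candidate paths each, all probabilities below 0.5.
path_probs = rng.uniform(0.05, 0.45, size=(1000, 4))

det_rate = adoption_rate_threshold(path_probs)
sto_rate = adoption_rate_sampled(path_probs, n_scenarios=50, rng=rng)
print(det_rate, sto_rate)
```

In this toy setting the thresholded rate is exactly zero because no probability reaches 0.5, while the sampled rate is roughly two thirds, since a trip with four paths of moderate probability is fairly likely to adopt at least one of them. This is the mechanism behind the underestimation described above.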
For core riders and adopted riders using the final network design, Table 3 also reports their travel times. It should be noted again that the model itself does not assign specific paths to riders. It ensures that core riders have at least one feasible path in $\mathcal{P}_r$ and also attempts to increase the number of feasible paths for latent riders. Once the network design is established, the fastest feasible path can be selected for each rider, which is an intuitive choice.

Lastly, the designs are evaluated using 1,000 ($I'$) additional SAA scenarios. Results show that all core riders are served under the respective designs with no violations in most cases, which is likely because each $\mathcal{P}_r$ contains four paths and the fixed rail arcs are effectively utilized by core riders. However, the Deterministic case exhibits 15 constraint violations when evaluated on these scenarios, indicating the need for stochastic decision-making models. Additionally, the $\mathrm{Eval}(z)$ results closely match the coverage values from the evaluation with $I$ for the 2LRC-TND approach, highlighting that the results are highly stable even when solving with only 50 SAA scenarios.

6 Conclusion

This paper introduces the 2LRC-TND framework to address the limitations of traditional transit network design approaches that assume fixed demand. Leveraging CSO, the framework integrates machine learning-based choice models and constraint programming-based optimization models to produce transit network designs. A large-scale case study with over 38,000 trips from Atlanta demonstrates that 2LRC-TND can improve ridership within reasonable budget limits, highlighting its practicality and potential social impact. Promising directions for future research include the design of more targeted surveys to enhance data collection and the extension of 2LRC-TND to emerging transportation systems.
Acknowledgments

This work was partially supported by the National Science Foundation under Grant No. 2112533 and Grant No. 2434302.

Bibliography

[1] Atlanta Regional Commission: Regional on-board transit survey 2019 final report. https://cdn.atlantaregional.org/wp-content/uploads/final-report-arc-2019-regional-transit-on-board-survey.pdf (2020), accessed: 2025-06-15
[2] Atlanta Regional Commission: Activity Based Model. https://abmfiles.atlantaregional.com/ (2025), accessed: 2025-07-25
[3] Auad, R., Dalmeijer, K., Riley, C., Santanam, T., Trasatti, A., Van Hentenryck, P., Zhang, H.: Resiliency of on-demand multimodal transit systems during a pandemic. Transportation Research Part C: Emerging Technologies 133, 103418 (2021)
[4] Basciftci, B., Van Hentenryck, P.: Bilevel optimization for on-demand multimodal transit systems. In: Hebrard, E., Musliu, N. (eds.) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. pp. 52–68. Springer International Publishing (2020)
[5] Basciftci, B., Van Hentenryck, P.: Capturing travel mode adoption in designing on-demand multimodal transit systems. Transportation Science 57(2), 351–375 (2023)
[6] Bertsimas, D., Kallus, N.: From predictive to prescriptive analytics. Management Science 66(3), 1025–1044 (2020)
[7] Bertsimas, D., Kallus, N., Hussain, A.: Inventory management in the era of big data. Production and Operations Management 25(12), 2006–2009 (2016)
[8] Bertsimas, D., Ng, Y.S., Yan, J.: Data-driven transit network design at scale. Operations Research 69(4), 1118–1133 (2021)
[9] Borndörfer, R., Grötschel, M., Pfetsch, M.E.: A column-generation approach to line planning in public transport. Transportation Science 41(1), 123–132 (2007)
[10] Canca, D., De-Los-Santos, A., Laporte, G., Mesa, J.A.: A general rapid network design, line planning and fleet investment integrated model. Annals of Operations Research 246(1), 127–144 (2016)
[11] Cipriani, E., Gori, S., Petrelli, M.: Transit network design: A procedure and an application to a large urban area. Transportation Research Part C: Emerging Technologies 20(1), 3–14 (2012)
[12] Farahani, R.Z., Miandoabchi, E., Szeto, W., Rashidi, H.: A review of urban transportation network design problems. European Journal of Operational Research 229(2), 281–302 (2013)
[13] Guan, H., Basciftci, B., Van Hentenryck, P.: Path-based formulations for the design of on-demand multimodal transit systems with adoption awareness. INFORMS Journal on Computing 36(6), 1359–1756 (2024)
[14] Guan, H., Basciftci, B., Van Hentenryck, P.: Bilevel Optimization and Heuristic Algorithms for Integrating Latent Demand into the Design of Large-Scale Transit Systems. Transportation Science (2026)
[15] Guan, H., Gillani, N., Simko, T., Mangat, J., Van Hentenryck, P.: Contextual stochastic optimization for school desegregation policymaking. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 39(27), pp. 28024–28032 (2025)
[16] Guihaire, V., Hao, J.K.: Transit network design and scheduling: A global review. Transportation Research Part A: Policy and Practice 42(10), 1251–1273 (2008)
[17] Jiang, Z., Ji, R.: Optimising hurricane shelter locations with smart predict-then-optimise framework. International Journal of Production Research 63(8), 2905–2925 (2025)
[18] Kleywegt, A.J., Shapiro, A., Homem-de-Mello, T.: The sample average approximation method for stochastic discrete optimization. SIAM Journal on Optimization 12(2), 479–502 (2002)
[19] Klier, M.J., Haase, K.: Line optimization in public transport systems. In: Operations Research Proceedings 2007: Selected Papers of the Annual International Conference of the German Operations Research Society (GOR) Saarbrücken, September 5–7, 2007. pp. 473–478. Springer (2008)
[20] Klier, M.J., Haase, K.: Urban public transit network optimization with flexible demand. OR Spectrum 37, 195–215 (2015)
[21] Lee, D., Derrible, S., Pereira, F.C.: Comparison of four types of artificial neural network and a multinomial logit model for travel mode choice modeling. Transportation Research Record 2672(49), 101–112 (2018)
[22] Ma, Y., Zhang, Z.: Travel mode choice prediction using deep neural networks with entity embeddings. IEEE Access 8, 64959–64970 (2020)
[23] Maheo, A., Kilby, P., Van Hentenryck, P.: Benders decomposition for the design of a hub and shuttle public transit system. Transportation Science 53(1), 77–88 (2019)
[24] Sadana, U., Chenreddy, A., Delage, E., Forel, A., Frejinger, E., Vidal, T.: A survey of contextual optimization methods for decision-making under uncertainty. European Journal of Operational Research 320(2), 271–289 (2025)
[25] Schöbel, A.: Line planning in public transportation: models and methods. OR Spectrum 34(3), 491–510 (2012)
[26] Tang, L., Luo, R., Zhou, Z., Colombo, N.: Enhanced route planning with calibrated uncertainty set. Machine Learning 114(5), 1–16 (2025)
[27] Wang, W., Deng, S., Zhang, Y.: Data-driven ordering policies for target oriented newsvendor with censored demand. European Journal of Operational Research 323(1), 86–96 (2025)
[28] Xie, C., Lu, J., Parkany, E.: Work travel mode choice modeling with data mining: decision trees and neural networks. Transportation Research Record 1854(1), 50–61 (2003)
[29] Ye, T., Cheng, S., Hijazi, A., Van Hentenryck, P.: Contextual stochastic optimization for omnichannel multi-courier order fulfillment under delivery time uncertainty. Manufacturing & Service Operations Management (2025)
[30] Zhao, X., Yan, X., Yu, A., Van Hentenryck, P.: Prediction and behavioral analysis of travel mode choice: A comparison of machine learning and logit models. Travel Behaviour and Society 20, 22–35 (2020)