Analyzing animal movement using deep learning
Thibault Fronville 1,2,*, Maximilian Pichler 3,*, Johannes Signer 4, Marius Grabow 1, Stephanie Kramer-Schadt 1,2, Viktoriia Radchuk 1,† & Florian Hartig 3,†

1 Leibniz Institute for Zoo and Wildlife Research (IZW), Department of Ecological Dynamics, Alfred-Kowalke-Straße 17, 10315 Berlin, Germany
2 Technische Universität Berlin, Institute of Ecology, Rothenburgstr. 12, 12165 Berlin, Germany
3 University of Regensburg, Theoretical Ecology, Universitätsstraße 31, 93053 Regensburg, Germany
4 University of Goettingen, Wildlife Sciences - Faculty of Forest Sciences and Forest Ecology, Büsgenweg 3, 37077 Göttingen, Germany

* Shared first authorship; † Shared last authorship

Corresponding authors: radchuk@izw-berlin.de, florian.hartig@biologie.uni-regensburg.de

Abstract

1. Understanding how animals move through heterogeneous landscapes is central to ecology and conservation. In this context, step selection functions (SSFs) have emerged as the main statistical framework to analyze how biotic and abiotic predictors influence movement paths observed by radio tracking, GPS tags, or similar sensors.

2. A traditional SSF consists of a generalized linear model (GLM) that infers the animal's habitat preferences (selection coefficients) by comparing each observed movement step to a number of random steps. Such GLM-SSFs, however, cannot flexibly consider non-linear or interacting effects of predictors on the movement, unless those have been specified a priori. To address this problem, generalized additive models (GAMs) have recently been integrated in the SSF framework, but these GAM-SSFs are still limited in their ability to represent complex habitat preferences and inter-individual variability.

3. Here we explore the utility of deep neural networks (DNNs) to overcome these limitations.
We find that DNN-SSFs, coupled with explainable AI (xAI) to extract selection coefficients, offer many advantages for analyzing movement data. In the case of linear effects, they effectively retrieve the same effect sizes and p-values as conventional GLMs. At the same time, however, they can automatically detect complex interaction effects, nonlinear responses, and inter-individual variability if those are present in the data.

4. We conclude that DNN-SSFs are a promising extension of traditional step selection models. Our analysis extends previous research on DNN-SSFs by exploring differences and similarities of GLM-, GAM-, and DNN-based SSF models in more depth, in particular regarding the validity of statistical indicators such as p-values and confidence intervals that are derived from the DNN. We also propose new DNN structures to capture inter-individual effects that can be viewed as a nonlinear random effect. All methods used in this paper are available via the 'citoMove' R package.

Deep neural networks for step selection analysis

Introduction

Animal movement is a fundamental ecological process that is the basis for a wide range of ecological and evolutionary phenomena. Understanding how, when, and why animals move is pivotal for predicting individual survival and reproductive success, and for explaining their spatial distribution and thus population and ecosystem dynamics (Bowler and Benton 2005; Nathan et al. 2008; Jeltsch et al. 2013; Bauer and Hoye 2014; Schlägel et al. 2019). Developing analytical methods to explore how animal movement emerges in interaction with the environment is therefore crucial for ecological research and conservation efforts (Nathan et al. 2008). The most common framework to analyze animal movement is the step selection function (SSF) approach (Avgar et al. 2016; Fieberg et al. 2021).
SSF models analyze how an animal selects its next movement step by comparing the observed trajectory to all other possible movement options, or to a limited set of predefined alternatives (see integrated SSF, Avgar et al. 2016; Fieberg et al. 2021). Mathematically, this corresponds to evaluating the relative likelihood of observed movement steps by integrating over all possible movement steps. This use-availability design at the scale of each movement step makes it possible to quantify interactions of the animal with its biotic and abiotic environments. The traditional approach to fit an SSF model is to approximate the theoretically continuous kernel of all possible movement steps by a finite number of randomly sampled potential steps. The contrast between these random steps and the observed steps is then fit using a conditional logistic regression, a generalized linear model (GLM) with a logit link (Therneau 2024; Signer et al. 2019; but other approaches exist, see Michelot et al. 2024; Muff et al. 2020). The creation of random steps as a contrast is similar to pseudo-absences in species distribution models, except that in an SSF model, a separate set of random steps is created exclusively for each observed step. These random steps are usually based on observed turning angles and step lengths to separate habitat selection and movement processes (Fieberg et al. 2021). This approach, which we refer to as a GLM-SSF, has been extensively applied in the analysis of empirical movement data, for example, to study animals' selective preference towards landscape features (Thurfjell et al. 2014), anti-predator behaviors (Latombe et al. 2014), human-wildlife interactions (Bjørneraas et al. 2011) or inter-individual interactions (Schlägel et al. 2019; Dickie et al. 2020).
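To make the use-availability design concrete, the generation of random steps can be sketched as follows. This is a minimal numpy illustration; the distribution parameters and the function name are hypothetical, and in practice packages such as 'amt' fit these distributions to the observed steps:

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_random_steps(x, y, heading, n_steps=10,
                        sl_shape=2.0, sl_scale=50.0, ta_kappa=1.0):
    """Sample candidate endpoints for one stratum (one observed step).

    Step lengths are drawn from a gamma distribution and turning angles
    from a von Mises distribution; both would normally be fitted to the
    observed movement steps.
    """
    sl = rng.gamma(sl_shape, sl_scale, size=n_steps)  # step lengths
    ta = rng.vonmises(0.0, ta_kappa, size=n_steps)    # turning angles
    new_heading = heading + ta
    return np.column_stack([x + sl * np.cos(new_heading),
                            y + sl * np.sin(new_heading)])

# One observed location at the origin, previous heading of 0 radians:
candidates = sample_random_steps(0.0, 0.0, 0.0, n_steps=20)
print(candidates.shape)  # (20, 2)
```

Each observed step is then contrasted against its own set of such candidates in the conditional logistic regression.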
A limitation of a GLM-SSF, as of any other parametric statistical model, is that the functional relationship describing the influence of biotic and abiotic conditions on the movement must be specified a priori. In practice, GLM-SSFs typically rely on linear and quadratic effects for all predictors. The real response of animals, however, is often more complex and likely to display nonlinear relationships or interactions between predictor variables (Mysterud and Ims 1998; Avgar et al. 2016). For instance, an animal might prefer moderate vegetation cover for foraging but avoid both open areas (due to predation risk) and dense vegetation (which impedes movement), resulting in a nonlinear, bell-shaped selection pattern. Also, habitat preferences are not always constant over time (Richter et al. 2020; Forrest et al. 2025a), and might shift, for example, between the seasons (Boyers et al. 2019). Moreover, even when the random steps are sampled from the observed movement distribution, it may be necessary to still include movement variables in the model, because habitat structure can modify the movement decisions (iSSF, see Avgar et al. 2016; Fieberg et al. 2021). While it is in principle possible to represent such complex relationships in a GLM-SSF, it is difficult to anticipate the functional form of these patterns a priori, motivating the search for modelling approaches that more flexibly adjust to the observed data. Movement ecology research in recent years has therefore focused on finding more flexible modelling approaches, for example by including time-varying selection patterns (Richter et al. 2020), incorporating multiple movement types (e.g. foraging and migration) to reflect behavioral variation (Klappstein et al. 2024) or accounting for inter-individual heterogeneity in habitat selection (Chatterjee et al. 2024; Klappstein et al. 2024; Muff et al. 2020).
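The bell-shaped preference described above is exactly the kind of pattern a GLM-SSF can only capture if a quadratic term is specified a priori. A toy illustration with invented coefficients:

```python
import numpy as np

# Hypothetical selection function w(x) = exp(b1*x + b2*x^2) with b2 < 0,
# i.e. a linear plus quadratic term in the GLM's linear predictor:
b1, b2 = 2.0, -1.0
cover = np.linspace(0.0, 2.0, 201)   # e.g. standardized vegetation cover
w = np.exp(b1 * cover + b2 * cover**2)

# Selection strength peaks at intermediate cover, at -b1 / (2 * b2):
peak = cover[w.argmax()]
print(round(peak, 6))  # 1.0
```

If the analyst omits the quadratic term, the fitted model cannot represent this hump-shaped response at all, which is the motivation for the more flexible approaches discussed next.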
These extensions have been implemented using a range of statistical frameworks, including generalized additive models (GAMs), mixed-effects models, and state-dependent (e.g. hidden Markov) models. Accounting for inter-individual variability in the SSF is a particularly interesting problem because animals' habitat selection often varies strongly through individual differences or personality (e.g., shy vs. bold behavior; Sloan Wilson et al. 1994). Ignoring such inter-individual variability can result in biased selection estimates (Lesmerises and St-Laurent 2017). To capture inter-individual variability and more complex habitat selection patterns, Klappstein et al. (2024) integrated random slopes and hierarchical smoothing terms (GAMs) into SSFs (GAM-SSFs). While these models offer more flexibility than GLM-SSFs, they are still limited in some dimensions, particularly when including a large number of nonlinear interactions (see, e.g., Pedersen et al. 2019) or when considering interactions between habitat selection and movement preferences. To allow for a more flexible approach to modelling animal movement, a logical next step is to follow the general trend in ecology towards flexible machine learning (ML) and deep learning (DL) algorithms as an alternative to the more rigid classical models from parametric statistics (Christin et al. 2019; Borowiec et al. 2022; Pichler and Hartig 2023b). ML is already being used in movement ecology to classify behavioral states from movement (Christin et al. 2019; Wang 2019) or species from images (Tabak et al. 2019); predict movement trajectories (Torney et al. 2021); or analyze habitat use (Cífka et al. 2023; Forrest et al. 2025b). For example, ML has offered the possibility to classify animals' activities such as feeding and resting from analyzing camera trap images (Norouzzadeh et al. 2018).
Another example is Browning et al. (2018), who predicted the diving behaviors of seabirds from GPS data with deep neural networks (DNNs). Consistent with this development, some recent studies successfully integrated ML and DL approaches within the SSF framework (Wijeyakulasuriya et al. 2020; Cífka et al. 2023; Forrest et al. 2025b). We consider DNNs to be a particularly promising approach for this purpose, as they naturally account for nonlinear, high-dimensional (higher-order interactions between variables) relationships while being able to retrieve near-unbiased parameter estimates (Pichler and Hartig 2023a). Furthermore, they are highly extensible and can be adapted to more advanced DL architectures. For example, Cífka et al. (2023) employed a Transformer-based movement model (MoveFormer) that considers past movement steps in the next movement decision, and Forrest et al. (2025b) used a convolutional neural network (CNN) architecture (deepSSF) that is capable of processing complex inputs such as images or sounds. These first results highlight the promise of DNNs for SSF models, but important questions remain that hinder their wider application for analyzing animal movement. A first fundamental question is whether DNNs, which are clearly more flexible, but potentially also data-hungry, work well with the comparatively small datasets that are typical for wildlife ecology. Second, neural networks do not natively provide effect sizes and p-values, which are crucial for ecologists who want to interpret the fitted models to understand the mechanisms behind animal movement. Explainable AI (xAI) techniques can make these "black-box" models more interpretable (Ryo et al. 2021) and even provide p-values and confidence intervals.
However, since ML models often trade bias against variance in parameter optimization (Shmueli 2010), it is somewhat unclear whether these extracted effects and their statistical validity are reliable in situations that are typical for wildlife ecology. Third, DNNs have a reputation for being difficult to use and less well supported in the R environment than comparable methods such as GLMs or GAMs, which arguably prevents uptake of these new methods in the movement ecology community. Here, we address all three challenges by comparing the performance of DNNs with GLM and GAM approaches for understanding the drivers of animal movement. Unlike previous studies that introduced neural networks in the SSF framework, we concentrate on fully connected DNNs (also known as multi-layer perceptrons, MLPs). While these are considered the simplest network architectures, they are also closest in structure and inputs to the more established algorithms such as GLMs or GAMs. Moreover, for MLPs, it is relatively straightforward to extract effect sizes and p-values, which allows us to compare their inferential performance with the established models in a typical ecological analysis, rather than comparing only their predictive performance. Using simulated data, our analysis focuses on the key challenges commonly encountered in real-world movement data: (i) inferring correct p-values and uncertainties for linear effects, (ii) capturing nonlinear patterns of habitat selection, (iii) revealing complex interactions between environmental predictors, and (iv) accounting for inter-individual variability in selection behavior. We finally applied DNN-SSFs to a case study (based on Stillfried et al. 2017) with tracking data from wild boar inhabiting urban and rural areas.
Here, our DNN-SSFs were able to infer biologically plausible differences in movement behavior that correlated with individuals' home ranges.

Methods

The step selection function (SSF) framework

The idea of the step selection function (SSF; Fortin et al. 2005; Rhodes et al. 2005) is to build a statistical model that compares the decision of the animal taken at each movement step to its (infinitely many) possible alternative movement steps. In the traditional SSF model, these alternatives were defined based on general characteristics of the movement process, such as average step length and turning angle. In practice, this was done by sampling, for each observed step, several random steps from a movement kernel fitted to the observed movement steps. The sets of observed/available steps form groups that are called strata, with each stratum representing a single decision point in the animal's movement. This approach effectively factors out the general movement characteristics of the animal through the random sample, resulting in a model with habitat selection effects only. The downside of this approach, however, is that the observed movement characteristics (e.g. step length) may be affected by the structure of the habitat. Thus, the approach can create biased estimates if the habitat structure substantially influences the general movement characteristics. This limitation was later addressed by Avgar et al. (2016), who proposed an extension of the SSF approach that integrates movement behavior and habitat selection into a unified framework, now commonly referred to as integrated step selection functions (iSSFs). A common way to implement this model is via a conditional logistic regression with a compound likelihood:

p(s_{t+1} | s_t) = w(x(s_{t+1})) · φ(s_{t+1}, s_t) / ∫_Ω w(x(z)) · φ(z, s_t) dz    equation (1)

w(s_{t+1}) = exp( β_1 x_1(s_{t+1}) + β_2 x_2(s_{t+1}) + … + β_k x_k(s_{t+1}) )    equation (2)

where p is the probability of moving to a location s_{t+1} given the previous location s_t; w is the exponential (log-linear) selection function that describes the effects β_k of the environmental covariates x_k at location s_{t+1}; and φ is a selection-independent movement kernel that reflects the movement patterns of the animal in a homogeneous landscape. In practice, φ is often expressed as a function of the step length and turning angle. The iSSF model thus understands movement as a compound process where the animal has a certain general movement behavior (described by the kernel φ) which is modified by habitat preferences (described by the selection effects w). In practice, evaluating the likelihood in eq. 1 requires integrating over all possible steps the animal could have taken (Ω). As this integral typically has no closed-form solution, it is usually approximated using Monte-Carlo integration, i.e. by randomly sampling a finite set of possible steps for each observed step. For numerical efficiency, these random steps are usually sampled non-uniformly according to the empirically observed distributions of step lengths (e.g., a Gamma distribution) and turning angles (e.g., a von Mises distribution), as in the normal SSF. In this case, however, the parameters of the fitted movement kernel φ have to be corrected later in the likelihood or in the parameters to account for the non-uniform sampling (Michelot et al. 2024). A certain downside of this approach is that it makes modelling statistical interactions between the movement kernel φ and the selection function w technically challenging, a problem which will be relaxed by our DNN-SSF later. This traditional parametric conditional logistic SSF model is implemented in several statistical tools, among them the 'amt' (Signer et al. 2019) or 'survival' (Therneau and Grambsch 2000) packages in R.
As discussed in the introduction, a crucial limitation of these parametric models is that they require the user to specify the assumed functional relationships a priori. This requirement is somewhat relaxed by the possibility to implement SSFs with GAMs through the R package 'mgcv' (Wood 2011; for more details see Klappstein et al. 2024), which can flexibly fit functional relationships, but still requires explicit specification of other structures, in particular interactions. The goal of finding a more flexible model motivates the use of DNNs in the SSF framework.

Introducing DNNs in the step selection framework

In theory, it is straightforward to replace the parametric movement and selection functions by a neural network. In practice, however, few studies have tested this approach so far. A first problem for assessing the performance of a DNN-SSF is that, unlike for GLM-SSF and GAM-SSF, there is so far no R package available for fitting these models "off-the-shelf". It is of course possible to create custom implementations of DNN-SSFs in the existing DL frameworks in R or Python, as done in previous studies (e.g. Cífka et al. 2023; Forrest et al. 2025b), but this approach makes re-use tedious and thus limits the utility of the methodological insights generated for ecological practitioners. For general neural networks, the 'cito' R package (Amesoeder et al. 2024a) aims at solving the problem of making these models available to a wide range of users. 'cito' provides an R interface to 'torch', a state-of-the-art deep learning framework (Falbel and Luraschi 2025). The 'cito' model specification philosophy follows the structure of many popular R regression packages.
Specifically, it uses the R formula syntax to specify the predictors and offers the ability to switch between different families and link functions. Moreover, cito provides many downstream functions, in particular explainable AI (xAI) methods to extract or plot the effect of predictors on model predictions. However, cito does not provide support for the conditional logit link, which would allow using it for the analysis of animal movement. To solve this problem, we developed a wrapper that extends 'cito' to DNN-SSFs. To explain the functioning of this new 'citoMove' package, remember that, in a traditional SSF, the probability of selecting a step to location s_{t+1}, given the environmental covariates x at that location (see eq. 1), is proportional to

p(s_{t+1} | s_t) ∝ w(x(s_{t+1})) · φ(s_{t+1}, s_t)    equation (3)

To enable more flexible modeling, we replace the right side of the equation by a neural network parameterized by weights θ:

p(s_{t+1} | s_t) ∝ exp( f_θ(x_{t+1}, s_{t+1}, s_t) )    equation (4)

The training of these weights is then delegated to the 'cito' R package. Thus, 'citoMove' simply replaces the parametric specifications of both the selection function and the movement kernel in a conventional GLM-SSF with a flexible neural network, while everything else (in particular the link function and the likelihood) stays identical. Interestingly, this approach, which integrates the movement kernel φ and the selection function w into a single function, naturally also considers interactions between movement and habitat selection. A disadvantage of using a DNN to model the animals' response to the predictors is that a DNN does not directly provide interpretable effect estimates and associated uncertainties such as p-values and confidence intervals. However, effect estimates and variable importance can be retrieved post-hoc, for example by exploring the (average) response of the fitted model to changes in the inputs.
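Computationally, eq. 4 means that the per-step scores in the conditional-logit likelihood now come from a network f_θ instead of a linear predictor. A bare-bones numpy sketch with untrained random weights, purely illustrative ('citoMove' delegates the actual training to 'torch' via 'cito'):

```python
import numpy as np

rng = np.random.default_rng(1)

# One hidden layer replacing the log-linear predictor of eq. 2.
# Inputs could include habitat covariates as well as step length and
# turning angle, so selection/movement interactions are learned jointly.
W1, b1 = rng.normal(size=(5, 16), scale=0.5), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1), scale=0.5), np.zeros(1)

def f_theta(x):
    """Scores of candidate steps: shape (n,) for input of shape (n, 5)."""
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return (h @ W2 + b2).ravel()

# One stratum: the observed step (row 0) plus 20 random steps:
steps = rng.normal(size=(21, 5))
scores = f_theta(steps)
probs = np.exp(scores - scores.max())
probs /= probs.sum()                   # conditional-logit softmax over the stratum
print(probs.shape, round(float(probs.sum()), 6))  # (21,) 1.0
```

Training then adjusts the weights so that observed steps (row 0 of each stratum) receive high probability, exactly as in the GLM-SSF likelihood.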
There are a large number of methods and metrics available for this purpose, collectively known as explainable AI (xAI; e.g. Ryo et al. 2021; Holzinger et al. 2022; Dwivedi et al. 2023). For 'citoMove', we make use of the fact that cito already calculates several xAI metrics, most importantly permutation importance, average conditional effects (Molnar et al. 2020; Pichler and Hartig 2023a), and accumulated local effect plots (Apley and Zhu 2020). A certain complication of DL models, which may also be seen as an advantage, is that they have many hyperparameters that control the complexity of the learned input-output relationship. These hyperparameters allow adjusting the complexity of the fitted relationships to the specific application; however, this also means that the hyperparameters should be optimized, typically by varying them and observing model performance on an independent hold-out dataset. Although we did not use these features in our current analysis, 'cito' aids the modeler in this task through several built-in functions for hyperparameter tuning.

Comparison of DNN-SSF to GLM-SSF and GAM-SSF

Having developed a DNN-SSF with essentially the same usability and functions as the more traditional SSFs based on GLMs or GAMs, we proceeded with five experiments to compare the performance of the three modeling approaches in simulated scenarios where the true underlying movement process is perfectly known (Figure 1). The first three experiments compare the performance of the three models in a) estimating linear effects (including p-values) of a single linear predictor (Figure 1, scenario 1), b) estimating nonlinear effects of single predictors (Figure 1, scenario 2), and c) estimating statistical interactions of multiple predictors (Figure 1, scenario 3).
The last two experiments do not have a direct counterpart in GLMs or GAMs, but highlight unique advantages of DNN-SSFs, namely d) the assessment of inter-individual variability in selection (similar to a random intercept effect in GLMMs or hierarchical GAMs; Figure 1, scenario 4), and e) the assessment of dynamically changing predictors, like distances between moving individuals (Figure 1, scenario 5).

Figure 1: Summary of the five scenarios built to test the inferential performance of DNN-SSFs. The first three scenarios were built to compare the inferential performance of DNN-SSFs to existing methods (GLM-SSF & GAM-SSF). The 4th and 5th scenarios were built to explore the performance of DNN-SSFs in inferring inter-individual variability in habitat selection and dynamically changing predictors.

Our first scenario was built to test the inferential performance of DNNs in recovering a linear effect of a single environmental predictor with correct statistical properties (5% type I error rate of the p-values, nominal coverage of the confidence intervals, unbiased parameter estimates). We calculated p-values and confidence intervals for the DNN-estimated effects based on 20 bootstrap replicates. We compared the results to a GLM-SSF only, as this model is theoretically the best choice if one is certain that effects are linear. Due to its higher flexibility, we expected that the DNN-SSF would perform slightly worse than the GLM-SSF (as predicted by the bias/variance trade-off). The goal of the experiment was, however, to show that these performance losses are moderate. For the second scenario, we tested the DNNs' ability to infer a nonlinear effect of one predictor.
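For reference, the bootstrap machinery used in the first scenario can be sketched generically. Given slope estimates from refitted replicates (here invented numbers standing in for 20 DNN-SSF refits on resampled data), percentile confidence intervals and a normal-approximation p-value follow directly:

```python
import math
import numpy as np

# Hypothetical slope estimates from 20 bootstrap refits; in the real
# analysis each value would come from refitting the DNN-SSF to a
# resampled dataset:
rng = np.random.default_rng(3)
boot_estimates = rng.normal(loc=0.8, scale=0.1, size=20)

estimate = boot_estimates.mean()
ci_low, ci_high = np.percentile(boot_estimates, [2.5, 97.5])

# Normal-approximation two-sided p-value for H0: slope = 0:
z = estimate / boot_estimates.std(ddof=1)
p_value = math.erfc(abs(z) / math.sqrt(2))
print(f"slope = {estimate:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}], p = {p_value:.1e}")
```

The calibration question examined in scenario 1 is whether such intervals and p-values, computed from refits of a flexible DNN, retain their nominal statistical properties.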
We compared the inference of DNN-SSFs to the inference obtained with GAM-SSFs using the 'mgcv' package (Wood 2025) with default spline penalties, which we see as the state-of-the-art choice for recovering univariate nonlinear effects. We expect that DNN-SSFs can infer nonlinear responses similarly to GAM-SSFs. For our third scenario, we tested the performance of DNNs in inferring statistical interactions among multiple predictors. For each effect, we ran 100 repetitions. From the estimated slopes, we then calculated the p-value, mean squared error (MSE), and variance. This scenario serves to demonstrate the advantages of DNNs as flexible models in higher-dimensional settings: by adding multiple predictors that are also in pairwise interactions, we assess whether DNNs can reveal non-additive relationships that GLM-SSFs might miss. In our fourth (d) and fifth (e) scenarios, our goal was not the comparison to established models but investigating a feature of DNNs that does not have a direct counterpart in GLMs or GAMs. This feature arises in the context of categorical grouping variables such as individuals or years. In a DNN, these can enter the model directly like any other variable and thus moderate the effect of environmental predictors, corresponding to normal interactions in a GLM, but it is also possible to implement a model architecture where categorical predictors are first embedded into a lower-dimensional numeric space before they are used in the larger network (Guo and Berkhahn 2016). The optimal number of embedding dimensions depends on the complexity of the underlying hierarchical structure of the predictor and can be tuned. Similar to an ordination, the position in the embedding space corresponds to how similar individuals are in their environmental responses (e.g., habitat selection).
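At its core, such an embedding is a learnable lookup table that maps each individual to a low-dimensional vector, which is then concatenated with the environmental inputs. A minimal sketch with numpy, using untrained random weights and invented dimensions:

```python
import numpy as np

rng = np.random.default_rng(7)

n_individuals, emb_dim, n_env = 20, 2, 20
# Learnable lookup table: one 2-d position per individual, optimized
# jointly with the rest of the network during training:
embedding = rng.normal(size=(n_individuals, emb_dim))

def network_input(individual_id, env):
    """Concatenate the individual's embedding with the environmental
    covariates before they enter the fully connected layers."""
    return np.concatenate([embedding[individual_id], env])

env = rng.normal(size=n_env)   # covariates of one candidate step
x = network_input(3, env)
print(x.shape)  # (22,)

# After training, distances in embedding space quantify how similarly
# two individuals respond to the environment (cf. an ordination):
dist = np.linalg.norm(embedding[3] - embedding[7])
```

Because the embedding positions are trained together with the network weights, individuals with similar responses end up close together, which is what makes the space interpretable.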
To interpret the embedding space, we can back-project how environmental effects depend on the embedding positions, akin to a bi-plot in an ordination. The embedding structure may be used to capture inter-individual variability, similar to a random intercept and random slope structure in mixed-effects models (GLMMs or GAMMs) or in hierarchical GAMs (Pedersen et al. 2019), but with the advantage that this extends to all kinds of similarities (also of individual interactions) in a nonlinear way while still having a relatively simple visual interpretation. In our fourth scenario (Figure 1), we tested the ability of this embedding architecture to represent and infer inter-individual variability in habitat selection. To do that, we simulated 20 individuals divided into four groups, which can be thought of as four personality types (five individuals per group) that react to an environment with 20 predictors. Individuals of the same group respond similarly to the environmental predictors. We designed the simulation such that five of the 20 environmental predictors did not affect the step selection of any individual at all (β = 0). Among the 15 predictors with an effect, coefficients for 10 environmental predictors were randomly generated (-3 < β < 3) and the other five environmental predictors were correlated (-3 < β < 3). The reason for this setting was that we expected correlated and uncorrelated predictors to manifest differently in the embedding space, which can be seen in our results. For this scenario, one hierarchical level (with two embedding dimensions) is present (grouped individuals). In the fifth scenario (Figure 1), we consider the problem that individuals may not only differ in their response to the environment, but also in their effect on other individuals.
Specifically, we consider three groups of five individuals each that move together in a homogeneous environment. At each time step, the nearest other individual acts as an 'opponent' that influences the movement of a focal individual ('social' predictor). Depending on the type of the opponent, focal individuals are either repelled, attracted or respond neutrally to the opponent (these responses are referred to as "individual interactions"). This example is designed to simulate a social structure where individuals' movement preferences vary depending on the specific individuals in their surroundings. For this scenario, one hierarchical level (with two embedding dimensions) is present (grouped opponents). For all scenarios, simulation settings were chosen to reflect sample sizes typical for empirical movement data. The DNN models were trained for 100 to 150 epochs (full passes over the training data). We confirmed that this was enough to reach stable convergence in all scenarios. As the DNNs were relatively simple, training the weights took only seconds to minutes per network, depending on dataset size and model complexity.

Case study

As our results in d) will demonstrate, the DNN-SSF embedding architecture is able to infer inter-individual variability in nonlinear responses, and this variability can be visualized in embedding space, where individuals that are closer to each other in the embedding space are more similar in their movement predictors. To show the practical use of this, we applied the method to real animal trajectories to explore the inter-individual variability in the movement of eight wild boars (Sus scrofa), with individuals inhabiting the urban area of Berlin, Germany's capital, and the rural surroundings of the federal state of Brandenburg. In the original study (Stillfried et al.
2017), wild boars were tracked at 30 min intervals and grouped into urban and rural individuals based on their vicinity to the urban matrix to study resource selection. Importantly, the grouping was a priori and not based on their movement behavior. This allows us to test if clustering individuals by their movement behavior would match the home-range-based grouping. The data were subsampled to a time resolution of 720 minutes, following Signer et al. (2025). For each observed step, we randomly drew 20 steps from the empirical gamma distribution of step lengths (sl_) and an empirical distribution of the turning angles (ta_), which was estimated using the R package 'PDFEstimator' (v4.5, Farmer and Jacobs 2018), and compared them to the observed step. We used the distance to water (dtw), refuge habitat (hab; percent tree cover per raster cell) and impervious surface (imp) of the urban matrix as covariates (for further information on environmental covariate preparation see Signer et al. 2025). We used permutation importance to infer the importance of the selection and movement variables, as well as the pair-wise importance of their interactions. Effects were visualized using accumulated local effect plots.

Results

For linear effects, DNN-SSFs show low bias and well-calibrated p-values and confidence intervals

Our first scenario was designed to ensure that DNN-SSFs can learn simple linear effects of the environment on the movement, and that p-values and confidence intervals calculated from the DNN using our xAI methods are well calibrated. Our results show that our DNN-SSF accurately recovers the linear effect of the environment across a sensible range of effect sizes (Figure 2).
Simulated effect sizes ranged from -2 to 2 for a standardized predictor, which amounts to strong effects considering the logit link, showing that the DNN delivers near-unbiased effects across the entire plausible range of effect sizes. The Type I error rates for both DNN-SSFs and GLM-SSFs were close to the nominal 0.05 level, as expected from a perfectly calibrated model. The coverage, i.e. the proportion of true slopes lying within the 95% confidence interval of the estimates, was also close to the expected 0.95 level. The only performance issue we found in comparison to the GLM-SSF is that effect estimates of the DNN-SSFs showed a slight, albeit non-significant, bias (difference between the true and estimated slope) towards zero for strong effects (see Figure 2b). In other words, there are (non-significant) hints that the DNN exhibits a slight regularization bias that pushes effect estimates towards smaller values in the absence of overwhelming data, which would be in line with existing research (e.g. Pichler and Hartig 2023a).

Figure 2: Validation of DNN effect estimates for a linear response to habitat: we compare slope estimates between DNN-SSF (red) and GLM-SSF (blue) models. Data were simulated with a linear response to one predictor, reflecting a range of possible strengths of the habitat preference from -2 to 2. a) Each point represents the mean estimated slope across simulations for a given true slope (ranging from -2 to 2), with vertical error bars indicating 95% confidence intervals. The diagonal black dashed line represents the true slope and serves as a reference baseline. Histograms of p-values under the null hypothesis (true slope = 0) are shown above and below the main panel for the DNN and GLM models, respectively.
b and c) Bias (difference between the true and estimated slope), coverage (true slope within the 95% confidence interval), abbreviated as 'cov', and the corresponding p-value for DNN-SSF (b) and GLM-SSF (c). The x-axis of b and c is the same as in a. We fitted a GAM to check whether the bias is significantly different from zero. Although there is a slight pattern in the GAM fit for the DNN (b), the p-values for both bias checks were not significant.

DNN-SSFs are as good as GAM-SSFs at recovering nonlinear effects

Our second scenario aimed to check whether DNNs can correctly infer the shape of nonlinear environmental effects on the movement. We simulated two scenarios, one with a relatively simple hump-shaped response that could arise from a niche-type environmental preference (Figure 3a), and one with a more complicated wiggly response (Figure 3b). Both the DNN-SSFs and GAM-SSFs were able to accurately approximate these functional responses (Figure 3). Although subtle in the visual representation, the mean squared error (MSE) was slightly lower for the DNN-SSF compared to the GAM-SSF, which may be connected to the fact that the DNN-SSF was better able to recover the functional complexity of the true underlying curve, whereas the estimated effect of the GAM seems a bit too "wiggly".

Figure 3: Inferring a nonlinear effect (smoothing): a comparison of a DNN-SSF (red) and a GAM-SSF (blue; with a smoothing term on the predictor) when recovering two nonlinear functional relationships between a single predictor and the movement from simulated data. The true shape of the functional response is represented by the black dashed line, while the colored lines correspond to the mean estimates of the two models.
The shaded areas represent the 95% confidence intervals of the mean estimates, and the mean squared errors (MSE) are indicated in red and blue.

DNN-SSFs are good at recovering statistical interactions between many predictors

Our third scenario was designed to compare DNN-SSFs and GLM-SSFs in their ability to infer linear main effects and linear interactions from a larger number of predictors (9 predictors, 9*8/2 = 36 possible interactions). The challenge in this setting is not so much the complexity of the response, as in the previous scenario, but the comparatively large number of predictors and their interactions. Our results show that both models were able to accurately identify the true main effects, as indicated by a high alignment of the circle sizes on the diagonal with the respective colored point sizes and by the low MSE values (Figure 4). We do not find pronounced differences between the two approaches. For the interactions, both models identified the three true interactions as important predictors. The DNN-SSF tended to slightly underestimate the effect of the true interactions and thus exhibited larger MSE and bias for these values (Figure 4), whereas the GLM-SSF tended to show more variance (Figure S1) and larger MSE (Figure 4) in effect estimates for interactions that were zero. This pattern is in line with the expectation that a DNN creates a shrinkage bias (Pichler and Hartig 2023a), which means that the model tends to push all interaction estimates towards smaller values and thus trades off bias for reduced variance. Averaged across all effects, the MSE was 0.08 for the DNN-SSF and 0.14 for the GLM-SSF, showing that the DNN-SSF outperforms the GLM-SSF regarding the average estimation error.
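The role of shrinkage in this comparison can be illustrated with a generic ridge-versus-OLS simulation. This is a minimal sketch of the general principle, not the authors' models: with many coefficients and few observations, an estimator that shrinks towards zero accepts a small bias in exchange for a large reduction in variance, lowering the average estimation error.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 60, 36                       # few observations, many coefficients
beta = np.zeros(p)
beta[:3] = 1.0                      # sparse true effects
lam = 10.0                          # hypothetical ridge penalty

ols_err, ridge_err = [], []
for _ in range(200):
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(scale=2.0, size=n)
    # Ordinary least squares: unbiased but high-variance here
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    # Ridge: shrinks all estimates towards zero (bias for variance)
    b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    ols_err.append(np.mean((b_ols - beta) ** 2))
    ridge_err.append(np.mean((b_ridge - beta) ** 2))

print("OLS coefficient MSE:  ", np.mean(ols_err))
print("Ridge coefficient MSE:", np.mean(ridge_err))
```

With these settings the shrinkage estimator achieves the lower average coefficient MSE, mirroring the pattern reported above for DNN-SSF versus GLM-SSF interaction estimates.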
This observation is compatible with the general insight in machine learning that, in situations with many predictors and few data, a shrinkage bias as identified above often leads to more stable parameter estimates (Hastie et al. 2009).

Figure 4: Inferring effects of nine predictors with multiple interactions: the figure visualizes the estimated main effects (diagonal) and two-way interactions of nine predictors (x1 to x9), comparing performance between a DNN-SSF (a, left panel) and a GLM-SSF (b, right panel). The data were simulated with main effects increasing in both directions from x5, which itself had no effect, and interactions between x1-x8, x3-x7, and x4-x6, indicated by black circles at the respective intersections. The size of each black circle reflects the true effect magnitude, while the size of the overlaid colored disk represents the estimated effect. Complete overlap between the black circle and the colored disk therefore indicates unbiased estimation. Color represents the mean squared error (MSE) of the estimated effect. Averaged across all effects, the MSE was 0.08 for the DNN-SSF and 0.14 for the GLM-SSF. For the DNN-SSF, the average MSE was 0.27 for the true interactions, 0.08 for null interactions, and 0.03 for the main effects. Corresponding values for the GLM-SSF were 0.15, 0.18, and 0.02. The sign information ("+" or "−") inside the circles shows the direction of the true effect. The DNN was trained with a learning rate of 0.01 for 150 epochs.

DNN embeddings can detect inter-individual differences in environmental preferences

In our fourth scenario, we simulated data for 20 individuals that were divided into four groups, with each group responding differently to the environmental predictors, akin to four different animal "personalities".
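The mechanics behind such an embedding can be sketched as follows. This is a schematic illustration with hypothetical numbers, not the fitted model: each individual i receives a low-dimensional embedding vector e_i, and a readout matrix W maps that vector to individual-specific selection coefficients, so individuals placed close together in embedding space obtain similar coefficients (in the real DNN-SSF, both the embeddings and the readout are learned during training).

```python
import numpy as np

rng = np.random.default_rng(1)
n_ind, n_dim, n_pred = 20, 2, 5     # individuals, embedding dims, predictors

# Four groups of five individuals, clustered in a 2-d embedding space
centers = np.array([[2, 0], [-2, 0], [0, 2], [0, -2]], dtype=float)
E = np.repeat(centers, 5, axis=0) + 0.1 * rng.normal(size=(n_ind, n_dim))

# Readout: maps embedding position to per-individual selection coefficients
W = rng.normal(size=(n_dim, n_pred))
beta = E @ W                        # shape (20, 5): one coefficient vector each

# Individuals close in embedding space get similar coefficients
same_group = np.linalg.norm(beta[0] - beta[1])   # both in group 1
diff_group = np.linalg.norm(beta[0] - beta[5])   # group 1 vs group 2
print(same_group < diff_group)                   # True: within-group similarity
```

Because the coefficients are a linear readout of the embedding position, the columns of W can also be drawn as arrows in the embedding plane, which is the biplot-style interpretation used in Figures 5 and 6.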
Our results show that a DNN-SSF with an embedding layer correctly retrieves these four groups, which manifests in individuals from the same group clustering in the embedding space (Figure 5). Furthermore, by back-projecting the influence of the embedding position onto the covariate effects, we find patterns that correspond with our simulated processes: covariates with no effect on the movement (β = 0) are not correlated with the embedding position of the individuals (Figure 5, black arrows). For predictors without collinearity, the embedding positions have varying but uncorrelated effects on the habitat selection (Figure 5, green arrows). For the group of collinear predictors, the embedding position has similar effects on the habitat selection (Figure 5, violet arrows), as one would expect, because these predictors were generated as variations of the same underlying environmental gradient. Overall, our results show that the embedding structure was successful in identifying individuals with similar behavior or personality, and back-calculating the effect of the embedding on the movement behavior allows interpreting the embedding position of individuals in terms of their behavioral differences in a meaningful way.

Figure 5: Individual responses of the focal individuals to environmental predictors. The embedding space depicts the projection of the response of 20 individuals to 20 environmental predictors. The shape of the points indicates the four different groups of individuals. The ellipses indicate group-level clustering within the embedding space, capturing the within-group similarity in selection patterns. Ellipses are added for visualization and do not represent the results of a formal clustering analysis. Arrows represent the effect of the embedding position on the selection coefficients of the individuals.
They are color-coded by type: independent of each other (green), correlated (violet), and those with no effect (black). The length of an arrow reflects the magnitude of the predictor's effect, with longer arrows indicating a stronger effect.

DNN-SSF embeddings can detect different classes of opponents

In our fifth scenario, we aimed to evaluate whether DNN embeddings can detect differences in how opponents affect the movement of focal individuals. We therefore applied the embedding layer to the opponents rather than the focal individuals and simulated a scenario with three types of animals that act 1) neutrally, 2) attractively, or 3) repellently on all other individuals. The results again reveal a clear grouping pattern within the embedding space (Figure 6), indicating that the model effectively captures variation in the opponents' effect on the focal individuals' movement. Moreover, the positioning of the groups relative to the effect of the embedding position (arrow in Figure 6) is easily interpretable and in line with our simulations: the neutral group is placed in the center, while the repelling group selects more positively and the attracting group more negatively for distance to the opponent, as indicated by the arrow.

Figure 6: Effects of individual opponents on the movement of the focal individual. The embedding space depicts the projection of the effects of 15 opponent individuals (nearest individual to the focal individual) on the focal individual. The shape of the points indicates the three different groups of opponents. The effect of the embedding position on the selection coefficient is depicted by an arrow. In this example, the only predictor of the movement is the distance to the opponent, so there is only one arrow. Note that the positioning of the groups relative to the distance arrow is in line with our simulations: the neutral group in the center, the repelling group (e.g.
prey fleeing a predator) selects positively for distance, and the attracting group (e.g. social group movement) selects negatively for distance.

Case study

As a final case study, we applied a DNN-SSF with embedding architecture to data of eight wild boars (Stillfried et al. 2017) at a resolution of 720 min. Using permutation importance, we quantified the influence of the five predictors on the movement decisions of wild boars. Distance to water (dtw) and imperviousness showed the highest overall importance (Figure 7a). Turning angle (ta_) appeared as the third-most important predictor. This is interesting because, having sampled the random steps from a non-parametric kernel estimator, we would not necessarily have expected large turning-angle effects (which essentially correct the initial estimate). An analysis of pair-wise importance (Figure 7b) shows that these values mostly originate from interactions of the turning angle with distance to water and imperviousness, suggesting that the directionality of the movement changes with habitat context. This finding underlines a distinct benefit of the DNN-SSF, as such movement-habitat interactions are difficult to implement in conventional SSF models. For distance to water, the most important variable, we found nonlinear negative effects that differ only slightly among individuals (Figure 7c). Examining differences in the embedding position of the individuals, we found that rural wild boars form a distinct cluster, suggesting that they show similar habitat selection. The cluster position suggests that rural individuals showed a stronger selection for habitats farther from water compared to urban individuals (Figure 7d). Urban wild boars differ more strongly in their movement behavior. These behavioral differences are mostly explained by turning angle and imperviousness (Figure 7d).
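Permutation importance, as used in this case study, follows a simple recipe that can be sketched generically. The example below is an illustrative toy version (hypothetical data and a fixed "fitted" linear model), not the citoMove implementation: one predictor column at a time is shuffled, and the importance of that predictor is the resulting degradation in the model's predictive score.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))             # three hypothetical predictors
y = 2.0 * X[:, 0] + rng.normal(size=n)  # only the first predictor matters

def score(X, y, w=np.array([2.0, 0.0, 0.0])):
    """Mean squared error of a fixed (pretend 'fitted') linear model."""
    return np.mean((y - X @ w) ** 2)

baseline = score(X, y)
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # break predictor-response link
    importance.append(score(Xp, y) - baseline)

print(np.argmax(importance))  # -> 0: permuting the relevant predictor hurts most
```

Pair-wise interaction importance (as in Figure 7b) extends the same idea by permuting two columns jointly and comparing the loss against the sum of the individual losses.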
These results overall support the hypothesis that urban wild boars have adjusted their behavior to their urban habitat, but there seems to be stronger variation in behavior between urban individuals, possibly related to stronger landscape heterogeneity and disturbances in urban habitats.

Figure 7: Analysis of the wild boar data (Stillfried et al. 2017) with a DNN-SSF. Eight wild boars were tracked in Berlin (urban) and in its surrounding rural area. We estimated a DNN-SSF simultaneously for all eight individuals, whose data were subsampled to a time resolution of 720 minutes (following Signer et al. 2025), with turning angle ("ta_"), imperviousness ("imp"), distance to water ("dtw"), step length ("sl_"), and % tree cover as refuge habitat ("hab") as predictors. The selection effects of these predictors are modified by an embedding layer in the DNN. Looking at the permutation importance of the predictors (a), distance to water and imperviousness had the strongest effects. The interaction importance showed strong interactions between the selection variables (imperviousness and habitat), as well as between the selection variables and the movement, particularly the turning angle (b). Accumulated local effect plots of the distance to water (c) show a nonlinear decaying effect of the predictor on the selection choice. Behavioral differences between rural individuals were smaller than between urban individuals, with distance to water explaining the main difference between urban and rural individuals and turning angle and imperviousness explaining variation between urban individuals (d).

Discussion

The goal of our study was to evaluate the performance of fully connected deep neural networks as a flexible curve-fitting structure within the SSF framework.
Our results show that DNN-SSFs can be a powerful extension of the step selection framework. They compare well against alternative models such as GLM-SSF or GAM-SSF, especially for capturing complex nonlinear selection patterns with predictor interactions and inter-individual variability. In particular, we showed that DNN-SSFs can correctly retrieve linear effects, including confidence intervals and p-values (Figure 2), as well as nonlinear effects of single predictors similar to a GAM (Figure 3); perform well at inferring sparse interaction structures between multiple predictor variables (Figure 4); and allow modelling differences between data groups (e.g. individuals) in an embedding structure that acts similarly to a nonlinear random effect on all predictors, but offers an easy interpretation akin to an ordination biplot showing the similarity between individuals or groups (Figures 5, 6). Moreover, the DNN-SSF naturally considers interactions between habitat and the movement kernel (Figure 7b), which seems an important advantage for the analysis of empirical movement data. Our results complement existing pioneering studies on the use of deep learning (DL) models in movement ecology, which concentrated more on the benefits of DL for using complex predictors (Cífka et al. 2023; Forrest et al. 2025b) and less on their inference capabilities. Our results suggest that the DNN can retrieve nearly unbiased effect estimates, in line with Pichler and Hartig (2023a). We also find that effect estimates from the DNN can achieve acceptable Type I error rates and coverage of confidence intervals. This shows that at least simple DNNs are not a completely different model class but seamlessly extend previous GLM- or GAM-based step-selection models for inferring habitat selection. The benefit of the DNN-SSF over existing approaches is arguably most relevant when we are faced with complex data and responses.
Our results show that, in simulated data, the DNN-SSF was able to retrieve linear two-way interactions as well as a GLM-SSF. The advantage of DNN-SSFs, however, is that these interactions did not have to be defined a priori, nor would they have to be linear. This becomes particularly evident when habitat selection variables interact with movement variables. Such interactions are usually not considered, likely because specifying them is simply not feasible given the combinatorially large number of possible interactions; however, in our case study, we found large interaction importance (Figure 7b) between turning angle and habitat variables, highlighting that these interactions should be given more attention in the future. We also showed that the DNN-SSF was able to recover simple nonlinear functional forms as well as or better than a GAM-SSF. These results suggest that DNN-SSFs, due to their flexibility and ability to approximate complex functions, may aid ecologists in automatically inferring complex functional relationships from the data. For instance, daily movement patterns of animals often exhibit complex, nonlinear patterns that can be influenced by a variety of factors such as temperature (Thaker et al. 2019), time of day (Klappstein et al. 2024), presence of predators (Fischhoff et al. 2007), or terrain (Avgar et al. 2013). Although we only tested nonlinear relationships with one predictor, we anticipate that DNNs excel particularly for a large number of nonlinear interactions. So far, those could only be specified using tensor splines in a GAM-SSF, but these structures are computationally costly and arguably less robust. A downside of the flexibility of DNN-SSFs is a potential loss of power. Model flexibility therefore has to be balanced against the available data, based on the bias–variance trade-off (James et al. 2013).
As model complexity increases, effective regularization becomes essential to control variance, but such regularization and optimization choices introduce small amounts of bias into the estimates (Zou and Hastie 2005). Although our study does not focus on these aspects of the modelling process, the 'citoMove' R package inherits many tunable regularization strategies (e.g., dropout, L2, and L1) from the 'cito' R package that help researchers control the bias–variance trade-off (Amesoeder et al. 2024b). An additional advantage, which is to our knowledge highlighted for the first time in this study, is that DNNs provide interesting options to specify nonlinear random effects through an embedding layer linked to the group identity. These structures can be seen as generalizations of random-slope models (Dingemanse and Dochtermann 2013; Hertel et al. 2020; Chatterjee et al. 2024; Klappstein et al. 2024) or hierarchical GAMs (Pedersen et al. 2019) that were previously used to analyze variation in movement behavior among individuals. In the embedding approach, each individual or group is placed in an n-dimensional embedding space during model training, and the location in this space then codes the type of environmental response, with individuals that are placed close to each other having similar (but potentially complex) environmental responses. The embedding position thus simultaneously encodes linear effects, nonlinearities, and interactions in one vector, and the similarity of groups can then be visualized similarly to an ordination biplot. Using this approach, we showed that DNN-SSFs can effectively capture inter-individual variability in habitat selection effects. It is assumed that such among-individual variation in habitat-selection strategies is not only driven by external factors (such as habitat quality, predation risk, seasonal condition, etc.)
but also arises due to differences in personality or behavioral state, past experiences, or physiology (Hertel et al. 2020). Accounting for such variation has long been recognized as essential for understanding the movement of animals (Shaw 2020), as it shapes predator-prey interactions (McGhee et al. 2013), dispersal (Clobert et al. 2009; Cote and Clobert 2010), fitness (Smith and Blumstein 2008), and thus population dynamics (Del Mar Delgado et al. 2018; Shaw 2020). Our results show that DNN-SSFs offer a very appealing analytical pipeline to detect such differences. Another interesting application of the embedding architecture is to reveal similarities in the effect an individual has on other individuals, which could help, for example, to study social structures (Langrock et al. 2014), responses in predator-prey systems or among competitors (Vanak et al. 2013), and group dynamics (leadership or following dynamics; Strandburg-Peshkin et al. 2018). In a simulated case study, we showed that the DNN-SSF can separate groups of individuals that act attracting, repelling, or neutral on other individuals. Existing methods, such as the framework by Schlägel et al. (2019), also allow estimating the effect of an individual on the movement behavior of other individuals. However, that approach relies on fitting an SSF for each individual independently, whereas our DNN-SSF embedding architecture allows considering the entire group at once. We believe that the approach presented here may help to uncover important mechanisms contributing to migration/dispersal, home-range analysis, group dynamics, or foraging strategies (Worton 1989; Jeltsch et al. 2013). Although not tested here, this framework offers a novel opportunity to incorporate statistical interactions between the group of the opponent and environmental predictors.
This enables analysis of how the habitat selection of a focal individual varies depending on the group of nearby individuals. A purposeful limitation of our study was that we concentrated on fully connected neural networks, also known as multi-layer perceptrons (MLPs), which are the simplest neural network architecture. An obvious extension of our work, already considered in other papers, would be to extend the neural networks to more advanced deep learning architectures. For example, convolutional layers could enable the direct use of raw spatial imagery (e.g., LiDAR or satellite images) as movement predictors (Forrest et al. 2025b), rather than their summary statistics. Another option would be the use of recurrent neural networks or attention-based architectures that include the past movement as context, thus in some sense creating an internal state of the animal that affects the movement decision (Cífka et al. 2023). These extensions represent highly promising avenues as DL continues to evolve, offering new opportunities for combining ecological inference with the flexibility of modern ML. However, in this study, we concentrated on the simpler MLPs so that the same predictors could be used and model performance could thus be compared to traditional SSF models.

Conclusion

Deep learning approaches offer a promising alternative to existing step-selection modeling frameworks for understanding the drivers behind animal movements. In this study, we demonstrated that by inserting simple fully connected DNNs within the familiar step-selection framework, we obtain a model that has similar properties to a GAM-SSF, in that it can infer nonlinear relationships while retaining interpretability, p-values, and confidence intervals.
However, this new model offers greater flexibility than the GAM-SSF regarding the specification of (nonlinear) interactions and provides exciting opportunities to analyze inter-individual differences in selection effects (e.g. animal personality) and in the effect of individuals on other individuals. Moreover, once we are already in a deep learning framework, it is relatively straightforward to extend the modelling approach to more complicated network architectures such as CNNs or transformers. To facilitate the uptake of the methods described in this paper, we provide them in the R package 'citoMove'.

Software

All analyses and simulations were performed in R 4.5.2 (R Core Team 2025) with the packages 'citoMove', 'survival' (Therneau 2024), 'amt' (Signer et al. 2019), and 'mgcv' (Wood 2025).

Acknowledgements

This work was supported by the German Research Foundation (DFG) Research Training Group "BioMove" (DFG-GRK 2118/2). We thank Milena Stillfried for collecting the wild boar field data. We thank Aastha Tapaliya for the support with the preliminary exploration of how machine learning can be used to infer movement interactions.

Data Availability

Upon acceptance of the manuscript, we will make the code to reproduce the analysis presented in this paper available as a GitHub repository and produce a persistent snapshot of this repository via Zenodo. The citoMove R package will be made freely available at the latest upon acceptance of the manuscript, ideally via CRAN.

Conflict of Interest statement

The authors have no conflicts of interest to declare.

Author Contributions

Thibault Fronville, Maximilian Pichler, Viktoriia Radchuk, and Florian Hartig conceived the ideas and designed the methodology; Thibault Fronville simulated the data; Thibault Fronville and Maximilian Pichler analyzed the simulated data.
Marius Grabow and Stephanie Kramer-Schadt prepared the spatial data for the case study; Maximilian Pichler analyzed the field data; Thibault Fronville, Maximilian Pichler, Florian Hartig, Viktoriia Radchuk, and Johannes Signer contributed to the development of the citoMove R package. Thibault Fronville, Maximilian Pichler, Viktoriia Radchuk, and Florian Hartig led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.

References

Amesoeder, Christian; Hartig, Florian; Pichler, Maximilian (2024a): 'cito': an R package for training neural networks using 'torch'. In Ecography (6), Article e07143. DOI: 10.1111/ecog.07143.

Amesoeder, Christian; Pichler, Maximilian; Hartig, Florian; Schenk, Armin (2024b): cito: Building and Training Neural Networks. Version 1.1: CRAN.

Apley, Daniel W.; Zhu, Jingyu (2020): Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models. In Journal of the Royal Statistical Society Series B: Statistical Methodology 82 (4), pp. 1059-1086. DOI: 10.1111/rssb.12377.

Avgar, Tal; Mosser, Anna; Brown, Glen S.; Fryxell, John M. (2013): Environmental and individual drivers of animal movement patterns across a wide geographical gradient. In The Journal of Animal Ecology 82 (1), pp. 96-106. DOI: 10.1111/j.1365-2656.2012.02035.x.

Avgar, Tal; Potts, Jonathan R.; Lewis, Mark A.; Boyce, Mark S. (2016): Integrated step selection analysis: bridging the gap between resource selection and animal movement. In Methods Ecol Evol 7 (5), pp. 619-630. DOI: 10.1111/2041-210X.12528.

Bauer, S.; Hoye, B. J. (2014): Migratory animals couple biodiversity and ecosystem functioning worldwide. In Science 344 (6179), p. 1242552. DOI: 10.1126/science.1242552.
Bjørneraas, Kari; Solberg, Erling Johan; Herfindal, Ivar; van Moorter, Bram; Rolandsen, Christer Moe; Tremblay, Jean-Pierre et al. (2011): Moose Alces alces habitat use at multiple temporal scales in a human-altered landscape. In Wildlife Biology 17 (1), pp. 44-54. DOI: 10.2981/10-073.

Borowiec, Marek L.; Dikow, Rebecca B.; Frandsen, Paul B.; McKeeken, Alexander; Valentini, Gabriele; White, Alexander E. (2022): Deep learning as a tool for ecology and evolution. In Methods Ecol Evol 13 (8), pp. 1640-1660. DOI: 10.1111/2041-210X.13901.

Bowler, Diana E.; Benton, Tim G. (2005): Causes and consequences of animal dispersal strategies: relating individual behaviour to spatial dynamics. In Biological Reviews of the Cambridge Philosophical Society 80 (2), pp. 205-225. DOI: 10.1017/s1464793104006645.

Boyers, Melinda; Parrini, Francesca; Owen-Smith, Norman; Erasmus, Barend F. N.; Hetem, Robyn S. (2019): How free-ranging ungulates with differing water dependencies cope with seasonal variation in temperature and aridity. In Conservation Physiology 7 (1), coz064. DOI: 10.1093/conphys/coz064.

Browning, Ella; Bolton, Mark; Owen, Ellie; Shoji, Akiko; Guilford, Tim; Freeman, Robin (2018): Predicting animal behaviour using deep learning: GPS data alone accurately predict diving in seabirds. In Methods Ecol Evol 9 (3), pp. 681-692. DOI: 10.1111/2041-210X.12926.

Chatterjee, Nilanjan; Wolfson, David; Kim, Dongmin; Velez, Juliana; Freeman, Smith; Bacheler, Nathan M. et al. (2024): Modelling individual variability in habitat selection and movement using integrated step-selection analysis. In Methods Ecol Evol 15 (6), pp. 1034-1047. DOI: 10.1111/2041-210X.14321.

Christin, Sylvain; Hervet, Éric; Lecomte, Nicolas (2019): Applications for deep learning in ecology. In Methods Ecol Evol 10 (10), pp. 1632-1644. DOI: 10.1111/2041-210X.13256.
Cífka, Ondřej; Chamaillé-Jammes, Simon; Liutkus, Antoine (2023): MoveFormer: a Transformer-based model for step-selection animal movement modelling. bioRxiv. DOI: 10.1101/2023.03.05.531080.

Clobert, Jean; Le Galliard, Jean-François; Cote, Julien; Meylan, Sandrine; Massot, Manuel (2009): Informed dispersal, heterogeneity in animal dispersal syndromes and the dynamics of spatially structured populations. In Ecology Letters 12 (3), pp. 197-209. DOI: 10.1111/j.1461-0248.2008.01267.x.

Cote, J.; Clobert, J. (2010): Risky dispersal: avoiding kin competition despite uncertainty. In Ecology 91 (5), pp. 1485-1493. DOI: 10.1890/09-0387.1.

Del Mar Delgado, Maria; Miranda, Maria; Alvarez, Silvia J.; Gurarie, Eliezer; Fagan, William F.; Penteriani, Vincenzo et al. (2018): The importance of individual variation in the dynamics of animal collective movements. In Philosophical Transactions of the Royal Society B: Biological Sciences 373 (1746). DOI: 10.1098/rstb.2017.0008.

Dickie, Melanie; McNay, Scott R.; Sutherland, Glenn D.; Cody, Michael; Avgar, Tal (2020): Corridors or risk? Movement along, and use of, linear features varies predictably among large mammal predator and prey species. In The Journal of Animal Ecology 89 (2), pp. 623-634. DOI: 10.1111/1365-2656.13130.

Dingemanse, Niels J.; Dochtermann, Ned A. (2013): Quantifying individual variation in behaviour: mixed-effect modelling approaches. In The Journal of Animal Ecology 82 (1), pp. 39-54. DOI: 10.1111/1365-2656.12013.

Dwivedi, Rudresh; Dave, Devam; Naik, Het; Singhal, Smiti; Omer, Rana; Patel, Pankesh et al. (2023): Explainable AI (XAI): Core Ideas, Techniques, and Solutions. In ACM Comput. Surv. 55 (9), pp. 1-33. DOI: 10.1145/3561048.

Falbel, Daniel; Luraschi, Javier (2025): torch: Tensors and Neural Networks with 'GPU' Acceleration. Version 0.16.3: CRAN.
Farmer, Jenny; Jacobs, Donald (2018): High throughput nonparametric probability density estimation. In PloS one 13 (5), e0196937. DOI: 10.1371/journal.pone.0196937.
Fieberg, John; Signer, Johannes; Smith, Brian; Avgar, Tal (2021): A 'How to' guide for interpreting parameters in habitat-selection analyses. In The Journal of animal ecology 90 (5), pp. 1027–1043. DOI: 10.1111/1365-2656.13441.
Fischhoff, I. R.; Sundaresan, S. R.; Cordingley, J.; Rubenstein, D. (2007): Habitat use and movements of plains zebra (Equus burchelli) in response to predation danger from lions. In Behavioral Ecology 18 (4), pp. 725–729. DOI: 10.1093/beheco/arm036.
Forrest, Scott W.; Pagendam, Dan; Bode, Michael; Drovandi, Christopher; Potts, Jonathan R.; Perry, Justin et al. (2025a): Predicting fine-scale distributions and emergent spatiotemporal patterns from temporally dynamic step selection simulations. In Ecography 2025 (2), Article e07421. DOI: 10.1111/ecog.07421.
Forrest, Scott W.; Pagendam, Dan; Hassan, Conor; Potts, Jonathan R.; Drovandi, Christopher; Bode, Michael; Hoskins, Andrew J. (2025b): Predicting animal movement with deepSSF: A deep learning step selection framework. In Methods Ecol Evol 17, pp. 371–391. DOI: 10.1111/2041-210X.70136.
Fortin, Daniel; Beyer, Hawthorne L.; Boyce, Mark S.; Smith, Douglas W.; Duchesne, Thierry; Mao, Julie S. (2005): Wolves influence elk movements: Behavior shapes a trophic cascade in Yellowstone National Park. In Ecology 86 (5), pp. 1320–1330. DOI: 10.1890/04-0953.
Guo, Cheng; Berkhahn, Felix (2016): Entity Embeddings of Categorical Variables. arXiv. DOI: 10.48550/arXiv.1604.06737.
Hastie, Trevor; Tibshirani, Robert; Friedman, J. H. (2009): The elements of statistical learning. Data mining, inference, and prediction. 2nd ed. New York, NY: Springer (Springer series in statistics).
Hertel, Anne G.; Niemelä, Petri T.; Dingemanse, Niels J.; Mueller, Thomas (2020): A guide for studying among-individual behavioral variation from movement data in the wild. In Movement ecology 8, p. 30. DOI: 10.1186/s40462-020-00216-8.
Holzinger, Andreas; Saranti, Anna; Molnar, Christoph; Biecek, Przemyslaw; Samek, Wojciech (2022): Explainable AI Methods - A Brief Overview. In Andreas Holzinger, Randy Goebel, Ruth Fong, Taesup Moon, Klaus-Robert Müller, Wojciech Samek (Eds.): xxAI - Beyond Explainable AI, vol. 13200. Cham: Springer International Publishing (Lecture Notes in Computer Science), pp. 13–38.
James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert (2013): An Introduction to Statistical Learning. New York, NY: Springer New York (103).
Jeltsch, Florian; Bonte, Dries; Pe'er, Guy; Reineking, Björn; Leimgruber, Peter; Balkenhol, Niko et al. (2013): Integrating movement ecology with biodiversity research - exploring new avenues to address spatiotemporal biodiversity dynamics. In Movement ecology 1 (1), p. 6. DOI: 10.1186/2051-3933-1-6.
Klappstein, Natasha J.; Michelot, Théo; Fieberg, John; Pedersen, Eric J.; Mills Flemming, Joanna (2024): Step selection functions with non-linear and random effects. In Methods Ecol Evol 15 (8), pp. 1332–1346. DOI: 10.1111/2041-210X.14367.
Langrock, Roland; Hopcraft, J. Grant C.; Blackwell, Paul G.; Goodall, Victoria; King, Ruth; Niu, Mu et al. (2014): Modelling group dynamic animal movement. In Methods Ecol Evol 5 (2), pp. 190–199. DOI: 10.1111/2041-210X.12155.
Latombe, Guillaume; Fortin, Daniel; Parrott, Lael (2014): Spatio-temporal dynamics in the response of woodland caribou and moose to the passage of grey wolf. In The Journal of animal ecology 83 (1), pp. 185–198. DOI: 10.1111/1365-2656.12108.
Lesmerises, Rémi; St-Laurent, Martin-Hugues (2017): Not accounting for interindividual variability can mask habitat selection patterns: a case study on black bears. In Oecologia 185 (3), pp. 415–425. DOI: 10.1007/s00442-017-3939-8.
McGhee, Katie E.; Pintor, Lauren M.; Bell, Alison M. (2013): Reciprocal behavioral plasticity and behavioral types during predator-prey interactions. In The American naturalist 182 (6), pp. 704–717. DOI: 10.1086/673526.
Michelot, Théo; Klappstein, Natasha J.; Potts, Jonathan R.; Fieberg, John (2024): Understanding step selection analysis through numerical integration. In Methods Ecol Evol 15 (1), pp. 24–35. DOI: 10.1111/2041-210X.14248.
Molnar, Christoph; Casalicchio, Giuseppe; Bischl, Bernd (2020): Interpretable Machine Learning -- A Brief History, State-of-the-Art and Challenges. arXiv. DOI: 10.48550/arXiv.2010.09337.
Muff, Stefanie; Signer, Johannes; Fieberg, John (2020): Accounting for individual-specific variation in habitat-selection studies: Efficient estimation of mixed-effects models using Bayesian or frequentist computation. In The Journal of animal ecology 89 (1), pp. 80–92. DOI: 10.1111/1365-2656.13087.
Mysterud, Atle; Ims, Rolf Anker (1998): Functional responses in habitat use: Availability influences relative use in trade-off situations. In Ecology 79 (4), pp. 1435–1441. DOI: 10.1890/0012-9658(1998)079[1435:FRIHUA]2.0.CO;2.
Nathan, Ran; Getz, Wayne M.; Revilla, Eloy; Holyoak, Marcel; Kadmon, Ronen; Saltz, David; Smouse, Peter E. (2008): A movement ecology paradigm for unifying organismal movement research. In Proceedings of the National Academy of Sciences of the United States of America 105 (49), pp. 19052–19059. DOI: 10.1073/pnas.0800375105.
Norouzzadeh, Mohammad Sadegh; Nguyen, Anh; Kosmala, Margaret; Swanson, Alexandra; Palmer, Meredith S.; Packer, Craig; Clune, Jeff (2018): Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. In Proceedings of the National Academy of Sciences of the United States of America 115 (25), E5716–E5725. DOI: 10.1073/pnas.1719367115.
Pedersen, Eric J.; Miller, David L.; Simpson, Gavin L.; Ross, Noam (2019): Hierarchical generalized additive models in ecology: an introduction with mgcv. In PeerJ 7, e6876. DOI: 10.7717/peerj.6876.
Pichler, Maximilian; Hartig, Florian (2023a): Can predictive models be used for causal inference? arXiv. DOI: 10.48550/arXiv.2306.10551.
Pichler, Maximilian; Hartig, Florian (2023b): Machine learning and deep learning — A review for ecologists. In Methods Ecol Evol 14 (4), pp. 994–1016. DOI: 10.1111/2041-210X.14061.
R Core Team (2025): R: A Language and Environment for Statistical Computing. Version 4.5.2: R Foundation for Statistical Computing. Available online at https://www.R-project.org/.
Rhodes, Jonathan R.; McAlpine, Clive A.; Lunney, Daniel; Possingham, Hugh P. (2005): A spatially explicit habitat selection model incorporating home range behavior. In Ecology 86 (5), pp. 1199–1205. DOI: 10.1890/04-0912.
Richter, Laura; Balkenhol, Niko; Raab, Christoph; Reinecke, Horst; Meißner, Marcus; Herzog, Sven et al. (2020): So close and yet so different: The importance of considering temporal dynamics to understand habitat selection. In Basic and Applied Ecology 43, pp. 99–109. DOI: 10.1016/j.baae.2020.02.002.
Ryo, Masahiro; Angelov, Boyan; Mammola, Stefano; Kass, Jamie M.; Benito, Blas M.; Hartig, Florian (2021): Explainable artificial intelligence enhances the ecological interpretability of black-box species distribution models. In Ecography 44 (2), pp. 199–205. DOI: 10.1111/ecog.05360.
Schlägel, Ulrike E.; Signer, Johannes; Herde, Antje; Eden, Sophie; Jeltsch, Florian; Eccard, Jana A.; Dammhahn, Melanie (2019): Estimating interactions between individuals from concurrent animal movements. In Methods Ecol Evol 10 (8), pp. 1234–1245. DOI: 10.1111/2041-210X.13235.
Shaw, Allison K. (2020): Causes and consequences of individual variation in animal movement. In Movement ecology 8, p. 12. DOI: 10.1186/s40462-020-0197-x.
Shmueli, Galit (2010): To Explain or to Predict? In Statist. Sci. 25 (3). DOI: 10.1214/10-STS330.
Signer, Johannes; Fieberg, John; Avgar, Tal (2019): Animal movement tools (amt): R package for managing tracking data and conducting habitat selection analyses. In Ecology and evolution 9 (2), pp. 880–890. DOI: 10.1002/ece3.4823.
Signer, Johannes; Scherer, Cédric; Radchuk, Viktoriia; Scholz, Carolin; Jeltsch, Florian; Kramer-Schadt, Stephanie (2025): The 4th Dimension in Animal Movement: The Effect of Temporal Resolution and Landscape Configuration in Habitat-Selection Analyses. In Ecology and evolution 15 (5), e71434. DOI: 10.1002/ece3.71434.
Sloan Wilson, D.; Clark, A. B.; Coleman, K.; Dearstyne, T. (1994): Shyness and boldness in humans and other animals. In Trends in ecology & evolution 9 (11), pp. 442–446. DOI: 10.1016/0169-5347(94)90134-1.
Smith, Brian R.; Blumstein, Daniel T. (2008): Fitness consequences of personality: a meta-analysis. In Behavioral Ecology 19 (2), pp. 448–455. DOI: 10.1093/beheco/arm144.
Stillfried, Milena; Gras, Pierre; Busch, Matthias; Börner, Konstantin; Kramer-Schadt, Stephanie; Ortmann, Sylvia (2017): Wild inside: Urban wild boar select natural, not anthropogenic food resources. In PloS one 12 (4), e0175127. DOI: 10.1371/journal.pone.0175127.
Strandburg-Peshkin, Ariana; Papageorgiou, Danai; Crofoot, Margaret C.; Farine, Damien R. (2018): Inferring influence and leadership in moving animal groups. In Philosophical transactions of the Royal Society of London. Series B, Biological sciences 373 (1746). DOI: 10.1098/rstb.2017.0006.
Tabak, Michael A.; Norouzzadeh, Mohammad S.; Wolfson, David W.; Sweeney, Steven J.; Vercauteren, Kurt C.; Snow, Nathan P. et al. (2019): Machine learning to classify animal species in camera trap images: Applications in ecology. In Methods Ecol Evol 10 (4), pp. 585–590. DOI: 10.1111/2041-210X.13120.
Thaker, Maria; Gupte, Pratik R.; Prins, Herbert H. T.; Slotow, Rob; Vanak, Abi T. (2019): Fine-Scale Tracking of Ambient Temperature and Movement Reveals Shuttling Behavior of Elephants to Water. In Front. Ecol. Evol. 7, Article 4. DOI: 10.3389/fevo.2019.00004.
Therneau, Terry M. (2024): A Package for Survival Analysis in R. Version 3.8-3: CRAN. Available online at https://CRAN.R-project.org/package=survival.
Therneau, Terry M.; Grambsch, Patricia M. (2000): Modeling Survival Data: Extending the Cox Model. New York, NY: Springer New York.
Thurfjell, Henrik; Ciuti, Simone; Boyce, Mark S. (2014): Applications of step-selection functions in ecology and conservation. In Movement ecology 2 (1), p. 4. DOI: 10.1186/2051-3933-2-4.
Torney, Colin J.; Morales, Juan M.; Husmeier, Dirk (2021): A hierarchical machine learning framework for the analysis of large scale animal movement data. In Movement ecology 9 (1), p. 6. DOI: 10.1186/s40462-021-00242-0.
Vanak, Abi Tamim; Fortin, Daniel; Thaker, Maria; Ogden, Monika; Owen, Cailey; Greatwood, Sophie; Slotow, Rob (2013): Moving to stay in place: behavioral mechanisms for coexistence of African large carnivores. In Ecology 94 (11), pp. 2619–2631. DOI: 10.1890/13-0217.1.
Wang, Guiming (2019): Machine learning for inferring animal behavior from location and movement data. In Ecological Informatics 49, pp. 69–76. DOI: 10.1016/j.ecoinf.2018.12.002.
Wijeyakulasuriya, Dhanushi A.; Eisenhauer, Elizabeth W.; Shaby, Benjamin A.; Hanks, Ephraim M. (2020): Machine learning for modeling animal movement. In PloS one 15 (7), e0235750. DOI: 10.1371/journal.pone.0235750.
Wood, Simon (2025): mgcv: Mixed GAM Computation Vehicle with Automatic Smoothness Estimation. Version 1.9-4: CRAN.
Wood, Simon N. (2011): Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models. In Journal of the Royal Statistical Society Series B: Statistical Methodology 73 (1), pp. 3–36. DOI: 10.1111/j.1467-9868.2010.00749.x.
Worton, B. J. (1989): Kernel Methods for Estimating the Utilization Distribution in Home-Range Studies. In Ecology 70 (1), pp. 164–168. DOI: 10.2307/1938423.
Zou, Hui; Hastie, Trevor (2005): Regularization and Variable Selection Via the Elastic Net. In Journal of the Royal Statistical Society Series B: Statistical Methodology 67 (2), pp. 301–320. DOI: 10.1111/j.1467-9868.2005.00503.x.

Supplementary Materials

Analyzing animal movement using deep learning

S1. Simulation settings

In our study, we simulated movement using a step-selection framework in which, at each time step, an individual selects its next location from a set of randomly generated available steps, with selection driven solely by environmental covariates. At each step, we generated a set of covariates for all available steps. For each environmental covariate we sampled random values from a uniform distribution:

$$x_j \sim U(0, 1) \qquad \text{(equation S1)}$$

To enhance interpretability and reduce collinearity in the presence of interaction terms, we centered the covariates prior to their use in the selection model.
To obtain the selection probability $p(i)$, we first calculate the selection score for each available step:

$$w_i = \exp\left( \sum_j \beta_j x_{ij} + \sum_k \gamma_k \, x_{i a_k} x_{i b_k} \right) \qquad \text{(equation S2)}$$

where $x_{ij}$ is the $j$-th covariate value for the $i$-th available step and $\beta_j$ is its corresponding selection coefficient; $\gamma_k$ is the coefficient for the $k$-th interaction between covariates $x_{a_k}$ and $x_{b_k}$. $w_i$ corresponds to the relative habitat selection strength (also called the unnormalized weight) of available step $i$. Nonlinear relationships can be incorporated by including transformed covariates among the predictors $x_{ij}$. The selection probability for step $i$ is then obtained by normalizing its weight over all available weights:

$$p(i) = \frac{w_i}{\sum_{i'} w_{i'}} \qquad \text{(equation S3)}$$

Finally, the true next step was selected by drawing from a multinomial distribution with probabilities $p(i)$. This ensures that steps with higher selection strengths are more likely to be chosen, while allowing stochastic variation in the selection process.

S2. DNN-SSFs are good at recovering statistical interactions between many predictors

Our third scenario was designed to compare DNN-SSFs and GLM-SSFs in their ability to infer linear main effects and linear interactions from a larger number of predictors (9 predictors, 9*8/2 = 36 possible interactions). Our results show that both models accurately identified the main effects, as indicated by the close alignment of the circle sizes on the diagonal with the respective colored point sizes and by the low variance values (Figure S1). We did not find pronounced differences between the two modeling options. For the interactions, both models identified the three true interactions as important predictors.
The DNN-SSF tended to slightly underestimate the effect of the true interactions (circles are not filled), while the GLM-SSF showed more variance in its effect estimates for interactions (Figure S1). This pattern is in line with the expectation that a DNN introduces a shrinkage bias: the model tends to push all interaction estimates towards smaller values and thus trades bias for reduced variance.

Figure S1: A matrix-based visualization of pairwise interaction effects among predictors (x1 to x9), comparing performance between a DNN-SSF (left panel) and a GLM-SSF (right panel). The diagonal represents the main effects of the 9 predictors; the other cells represent specific interaction terms between two predictors. Interactions were present between x1-x8, x3-x7 and x4-x6. The true effect size is indicated by the size of the black circle, while the estimated size is indicated by the size of the colored point. The variance of the estimated effect is indicated by color. Sign information ("+" or "−") inside circles shows the direction of the true effect.
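For concreteness, the simulation step described in S1 can be sketched as follows. This is an illustrative Python translation of equations S1–S3, not the study's actual R code; the coefficient values, the number of available steps, and the function name `simulate_step` are placeholders.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_step(betas, gammas, interactions, n_available=20):
    """Draw one step under the S1 simulation (illustrative sketch).

    betas        : (J,) linear selection coefficients (beta_j)
    gammas       : (K,) interaction coefficients (gamma_k)
    interactions : list of K index pairs (a_k, b_k) into the covariates
    Returns the index of the chosen available step.
    """
    J = len(betas)
    # Covariates for each available step: x_ij ~ U(0, 1)  (equation S1)
    x = rng.uniform(0.0, 1.0, size=(n_available, J))
    # Center the covariates to reduce collinearity with the interaction terms
    x = x - 0.5
    # Selection score w_i = exp(sum_j beta_j x_ij + sum_k gamma_k x_{i,a_k} x_{i,b_k})  (equation S2)
    linear = x @ betas
    inter = sum(g * x[:, a] * x[:, b] for g, (a, b) in zip(gammas, interactions))
    w = np.exp(linear + inter)
    # Selection probability p(i) = w_i / sum_i' w_i'  (equation S3)
    p = w / w.sum()
    # The realized next step is a single categorical (multinomial, n=1) draw
    return rng.choice(n_available, p=p)

# Example: two covariates with one interaction between them (placeholder values)
chosen = simulate_step(betas=np.array([1.0, -0.5]),
                       gammas=np.array([0.8]),
                       interactions=[(0, 1)])
```

Because the weights are normalized before the draw, steps with higher selection scores are chosen more often, yet any available step can be selected, reproducing the stochastic selection process described above.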