TickTalk -- Timing API for Dynamically Federated Cyber-Physical Systems

Perspective Paper: TickTalk - Timing API for Dynamically Federated Cyber-Physical Systems

Bob Iannucci†, Aviral Shrivastava∗ and Mohammad Khayatian∗
†Carnegie Mellon University, ∗Arizona State University
bob@sv.cmu.edu, aviral.shrivastava@asu.edu, mkhayati@asu.edu

Abstract: Although timing and synchronization among a dynamically-changing set of sensing, computing, and actuating elements and their related power considerations are essential to many cyber-physical systems (CPS), these concepts are absent from today's programming languages, forcing programmers to handle these matters outside of the language and on a case-by-case basis. This paper proposes a framework for adding time-related concepts to languages. Complementing prior work in this area, this paper develops the notion of dynamically federated islands of variable-precision synchronization and coordinated entities through synergistic activities at the language, system, network, and device levels. At the language level, we explore constructs that capture key timing and synchronization concepts. At the system level, we propose a flexible intermediate language that represents both program logic and timing constraints together with run-time mechanisms. At the network level, we argue for architectural extensions that permit the network to act as a combined computing, communication, storage, and synchronization platform. At the device level, we explore architectural concepts that can lead to greater interoperability, the easy establishment of timing constraints, and more power-efficient designs.

I. INTRODUCTION

Our imagination and concepts of Cyber-Physical Systems (CPS) are transforming our vision of the Internet of Things (IoT), Internet of Everything (IoE) and so-called smart cities. The concepts are simultaneously appealing and puzzling.
The appeal comes from the ability to apply computing and communications technologies in numerous ways and on a wide scale to improve life. The puzzling aspect is how to achieve it. If we can sense anything, and actuate anything, what useful things can we do? One example is to empower anyone to track people and things valuable to them (their child on a bicycle, a truck, stolen property) using information gleaned from a smart city's collective pool of sensors and to initiate some appropriate actions. There are many similar time-sensitive, distributed computing tasks in a smart city or other IoT networks that involve interaction with spatially distributed nodes [1]. Many such applications written by various programmers should be able to share the city-wide CPS infrastructure. Thus each CPS node will accept snippets of code from separately-created programs to run in a coordinated way with other nodes in the system. For instance, one programmer may be interested in taking a picture at 4:00 pm while another programmer is interested in sensing the temperature in the environs of the same CPS node at 4:00 pm. How do we make all this possible, especially when the programs are being developed separately and without coordination? How do we know, for example, if the combined functionality is even possible? How do we make programming these geographically distributed time-sensitive systems easier?

Programming CPS is difficult because it combines the complexities of distributed programming and time-sensitive programming, both of which bring portability and scalability issues [2]. What is a good distributed-timing application programming interface (API) that can ease this burden? Can such an API offer clean semantics for reasoning about the time-related behavior of these programs? In this paper, we explore these questions.

II. NEED FOR TIMING AND SYNCHRONIZATION API

Consider the example of a smart city in which a transportation company wants to "observe" one of its assets, in this case, an en-route truck. Imagine that they have the authority to dynamically recruit pools of separately-installed and -managed cameras that may be found around the smart city (e.g., on buildings, poles) to get a 3D video view of the truck's movement. These scattered CPS nodes can be fused to become a federated cyber-physical system (FCPS) which can sense, compute, communicate and actuate as an integrated whole. But even the simple notion of collecting video from cameras near the truck will involve enrolling more cameras over time as the truck moves. We call such a system a (dynamically) federated cyber-physical system (DFCPS). Figure 1 depicts the truck moving around the city and how the notion of "nearby cameras" must evolve. Network boundaries between sets of separately-managed cameras are depicted with solid green lines. The dotted blue line shows the trajectory of the truck.

Fig. 1. Dynamically enrolling separately-managed cameras scattered around a city to track a truck.

    loop {
        // assume (x, y) is the predicted position of the object
        S = getSensors(x, y, 100 m);    // get sensors within 100 meters of (x, y)
        A = emptySet(sizeOf(S));        // empty set of images
        withSynchronization(S, 1 us, self) {
            // within this block, the sensors will synchronize to 1 us accuracy
            A = simultaneously(S.captureImage());
        }
        3DImage = create3DImage(A);
        3Dmodel.addImage(3DImage);
        (x', y') = predictNextPosition(x, y, A, t);    // new predicted position
        if ((x', y') == (x, y)) break;
        else (x, y) = (x', y');
    } // end loop

Fig. 2. Pseudo-code for tracking a moving object using scattered cameras in the city.

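
The loop in Figure 2 can be made concrete with a small simulation. The Python sketch below is illustrative only: the Camera class, the helper names get_sensors, simultaneously and track, and the camera coordinates are assumptions introduced here and are not part of any proposed API. Synchronization is idealized by stamping every frame captured in one step with the same time, and a precomputed path stands in for predictNextPosition. What the sketch shows is how the enrolled set of cameras changes as the object moves.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical stand-in for a camera node at a fixed position (metres)."""
    name: str
    x: float
    y: float

    def capture_image(self, t):
        # A real node would return pixels; here a frame is just (camera, time).
        return (self.name, t)


def get_sensors(cameras, x, y, radius_m):
    """Return the cameras within radius_m metres of (x, y)."""
    return [c for c in cameras if math.hypot(c.x - x, c.y - y) <= radius_m]


def simultaneously(sensors, t):
    """Idealized simultaneous capture: every frame carries the same timestamp t."""
    return [s.capture_image(t) for s in sensors]


def track(cameras, path, radius_m=100.0):
    """Follow an object along path, re-enrolling nearby cameras at each step.

    path is a list of predicted positions (an assumed replacement for
    predictNextPosition). Returns the camera names enrolled at each step.
    """
    enrolled = []
    for t, (x, y) in enumerate(path):
        sensors = get_sensors(cameras, x, y, radius_m)
        frames = simultaneously(sensors, t)
        enrolled.append([name for name, _ in frames])
    return enrolled


cameras = [Camera("c1", 0, 0), Camera("c2", 150, 0), Camera("c3", 300, 0)]
path = [(0, 0), (100, 0), (200, 0), (300, 0)]
print(track(cameras, path))
```

In this toy run the object starts next to c1, passes through the overlap of c1 and c2, then of c2 and c3, and ends next to c3, so membership in the federation changes at every step even though the program text never names a specific camera.
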
One approach to dynamic federation is to impose some sort of hierarchy: within co-located clusters, one node can serve as a leader responsible for in-cluster synchronization and global communication. This leader could arrange for the other nodes to take pictures at specified times. The leader gathers these frames and constructs a dynamic 3D model from which estimates of the future position of the object can be made.

The typical approach would be to write application-specific code for each participating node together with code to coordinate the actions of these nodes. Verification and validation of separate applications working together are challenging because they necessitate the specification of the overall system's behavior. Testing is likewise challenging. A better approach and, we dare say, one that might be more acceptable to the millions of programmers who might invest their efforts in the creation of such smart city apps, is to develop one integrated application that can be verified and then, separately, decomposed and distributed to the nodes on which it will run. Representative pseudocode is shown in Figure 2.

While the pseudocode may seem simple, it highlights important opportunities and challenges that emerge from the very nature of programming a geographically-distributed aggregate of computing resources. In this section, we will discuss the challenges of adding timing concepts to the programming language as well as achieving synchronization for a scattered time-sensitive system.

A. Time-related Programming and Synchronization

To achieve deterministic timing on IoT/CPS devices, timing must be made a correctness criterion and not just a performance factor. Hence, by making timing constraints [3] and requirements part of the formal model/program, reasoning about and verifying timing requirements becomes possible.
Correctly defined, the semantics of timing primitives in specification models determine whether correctness properties can be checked by inherent construction, symbolic analysis, explicit simulation, or only in the implementation. However, as long as the timing specification is not a part of the programming language, whether a system implementation meets the timing requirements can only be checked by testing after building the whole system. Programmers of the future will have to make the system achieve correct timing even though today's popular languages lack mechanisms for expressing the needed time-related concepts. For example, in C, we lack a concept such as the following:

    printf("hello world\n", @4:35 PM);

Even if programmers had such expressive power, making good on their intent will require new mechanisms in the underlying network and devices. For applications similar to our example, nearby CPS nodes only need to be synchronized among themselves and do not need to be synchronized to a time server or to coordinated universal time (UTC). We simply need to create the conditions under which they all take photos at essentially the same instant.

Other applications may need synchronization to an external reference (e.g., UTC). For instance, a user may be interested in polling data at exactly 11:00 AM. Therefore, all sensing/actuating nodes of an FCPS must have a common understanding of real time and must execute the sensing/actuating code exactly at the specified time regardless of the worst-case execution time (WCET) of the underlying computation platform [4], network delay, local clock drift, and so on. Since time-related concepts are absent from today's programming languages, programmers must handle these matters outside of the language.

B. Cost-Power Efficiency

As sensors proliferate in a smart city, the cost of providing each one with a wired power connection will become overwhelming [5].
Devices that operate for years on batteries and/or harvest energy will be preferred. A very closely related issue for small in-the-environment sensors and actuators is the power-cost of achieving time awareness. Time synchronization between two or more devices necessitates frequent communication. As the need for precision increases, so does the power-cost of achieving it. Further, we must accept that energy-constrained devices must be mostly off, implying that such devices either need to invest power in precise, low-drift internal clocks or plan to wake up "just in time" to perform over-the-network resynchronization. It is likely to be power-advantageous for nodes to be continuously but loosely synchronized to UTC, with just enough power put into the local clock to enable in-time wakeup, wireless synchronization, and the on-time capturing of the picture (after which time-keeping can revert to low precision). It will be important to develop power-efficient solutions for implementing the behind-the-scenes (runtime) mechanisms that will reliably achieve application-level timing precision at the lowest possible power levels and without introducing jitter or timing errors.

C. Support for Heterogeneity

Returning to our truck-tracking example, it is unlikely that the borrowed cameras will be identical: same vendor, same programming interface, same functionality, same performance. Rather, in the information-sharing economy of this smart city, cameras and other sensors are likely to be dissimilar. It is already difficult enough to imagine fusing the data, but DFCPS compounds the programmer's challenges by requiring the management of timing across an array of dissimilar devices. This is an uncommon, rather than a common, software engineering skillset.
We are, then, left with the conundrum that the domain (DFCPS) requires programmer-management of time, yet programmers and their tools (languages, compilers, run-time systems) are ill-prepared for this. Viewing this as an architectural problem, we seek a solution that only requires that the programmer specify the timing intent (do these three things at the same time) and leaves to the run-time mechanisms the realization of these requirements. As we will see, this necessitates some minimal augmentation of the timing mechanisms in computing and communications equipment. Precision protocols for network-based transport of time information (e.g., PTP/IEEE-1588) similarly identify the need for specific hardware support.

D. Multi-Tenancy: Code Block Multiplexing

Our tracking example and others suggest that significant value will be derived from recruiting sensors/actuators dynamically, and making them sense or take action in a synchronized fashion. As the value of the smart city catches on, our programmer won't be the only one using the cameras. Many apps in this smart city will likely want to concurrently share some or all of the cameras. At that point, our carefully-synchronized application will be faced with the challenge of sharing the hardware with other concurrent applications. While sharing an IoT device (e.g., a motion sensor, a camera) across applications may seem simple, different applications will impose differing and possibly conflicting requirements with respect to time. At a minimum, we imagine the need to discover potential conflicts, harmonize them when possible and signal irreconcilable conflicts otherwise.

As a starting point, if programmer-derived timing requirements are expressed cleanly and clearly, we can imagine compiling and run-time tools that will enable this sorting-out of separately-created timing constraints that come together on a single hardware device. But there are subtle complications.
Imagine that code block b1 seeks to run on a given node (say, one of our cameras) and, per the programmer's intent, it is synchronized to some reference clock c1. But along comes code block b2 from a different application that also seeks to run on this same camera. Alarmingly, its programmer has elected to use a different reference clock c2 that is possibly incompatible with c1. These clocks may differ in frequency, phase and/or epoch, and for good reasons known to their programmers. In this case, we must go beyond considerations of simply sharing the computing resources (by traditional virtualization techniques, for instance) to embracing the notion that clocks themselves must be virtualized, allowing for a separate clock per code block. The run-time system must somehow deal with issues of non-synchronization among these reference clocks.

III. OUR APPROACH

Our approach advances the concept of an easily-programmed federated cyber-physical system (FCPS) that hides the inherent complexities of synchronization of distributed actions. We model an FCPS as a tuple (C, E, B) in which:

    C = {c1, c2, ...} is the set of reference clocks. Each clock is characterized by its frequency, phase, jitter, etc.
    E = {e1, e2, ...} is the set of computing, storage, actuating and sensing ensembles (think: nodes; more on this below), each of which has its own set of local clocks.
    B = {b1, b2, ...} is the set of computational blocks (program fragments) within which actions can be scheduled to take place at specific times.

We use the term ensemble to capture the notion of an element that has computing, storage, communication and timing capabilities that allow it to accept one or more code blocks.
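
To make the (C, E, B) model and the idea of a separate clock per code block more tangible, the following Python sketch represents each reference clock as a linear mapping from an ensemble's local time. This is a deliberate simplification and an assumption of this sketch: jitter and drift are ignored, and the names ReferenceClock, CodeBlock, Ensemble, FCPS, rate and offset are hypothetical. It shows how two code blocks bound to clocks that differ in frequency and epoch could be placed on one local timeline.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReferenceClock:
    """A reference clock c_i, modelled relative to one ensemble's local clock.

    rate   -- reference ticks per local tick (frequency ratio)
    offset -- reference time when the local clock reads zero (phase/epoch)
    """
    name: str
    rate: float = 1.0
    offset: float = 0.0

    def to_local(self, ref_t):
        # Invert ref_t = rate * local_t + offset.
        return (ref_t - self.offset) / self.rate


@dataclass
class CodeBlock:
    """A program fragment b_i with one action scheduled in its own clock's time."""
    name: str
    clock: ReferenceClock
    fire_at: float


@dataclass
class Ensemble:
    """A node e_i that hosts code blocks, each with its own virtual clock."""
    name: str
    blocks: list = field(default_factory=list)

    def local_schedule(self):
        """Return (local_time, block_name) pairs ordered by local time."""
        return sorted((b.clock.to_local(b.fire_at), b.name) for b in self.blocks)


@dataclass
class FCPS:
    """The tuple (C, E, B)."""
    clocks: list
    ensembles: list
    blocks: list


c1 = ReferenceClock("c1")                        # same rate and epoch as local time
c2 = ReferenceClock("c2", rate=2.0, offset=10.0)  # twice as fast, different epoch
b1 = CodeBlock("b1", c1, fire_at=5.0)
b2 = CodeBlock("b2", c2, fire_at=16.0)
camera = Ensemble("camera", [b1, b2])
system = FCPS(clocks=[c1, c2], ensembles=[camera], blocks=[b1, b2])
print(camera.local_schedule())
```

Although b2 asks for time 16 and b1 for time 5, b2 fires first on the camera because its request is expressed against a faster clock with a later epoch. A real run-time system would also have to bound the error of each mapping, which this sketch does not attempt.
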
It is worth noting that our notion of ensemble is intentionally broad and is intended to abstract the hardware for sensor and actuator nodes (including the computing, storage, and communication chips associated with them), network-resident computing facilities such as would be necessary to implement fog computing or cloudlets, and cloud computing equipment such as would be found in large, virtualized data centers. We specifically contemplate the additional possibility of dynamically migrating code blocks from ensemble to ensemble, implying a notion of common base functionality. We use the term ensemble instance to denote an invocation of a code block on a particular ensemble with a particular ensemble-local clock.

By characterizing ensembles in this way, we enable the possibility of taking a single program and breaking it into pieces that run concurrently in the cloud, in the network, and on the devices. Note that this is different from the traditional model in which the cloud code is written by one team, the device code is written as part of the development of a power-constrained embedded system, and the network is largely un-programmable by non-specialist developers. We imagine, as a possible outcome of this research, the creation of a reference architecture for ensembles that could aid in assuring growing interoperability among future smart city elements.

Fig. 3. The proposed architecture. High-level programs are decomposed to an intermediate-level form in which time-based operations are explicitly represented.

As depicted in Figure 3, programs written in a suitable high-level language with timing constructs will be translated into a dataflow graph. Dataflow provides a clear dependency-driven graph interpretation framework to which we seek to add reference clock synchronization semantics. A simple formulation is to decompose FCPS meta-programs into graphs in which each node represents the instance of a code block on a specific ensemble.
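
As a rough illustration of dependency-driven firing in such a graph, the Python sketch below models a single graph node that runs its code block only once every required input token has arrived. Treating an established clock synchronization as one more required token (named "sync:c1" here) is an assumption made for this sketch, and the Node class and its methods offer, ready and fire are hypothetical names rather than part of any defined intermediate language.

```python
class Node:
    """One instance of a code block on a specific ensemble (hypothetical model)."""

    def __init__(self, name, inputs, action):
        self.name = name
        self.required = set(inputs)  # names of tokens needed before firing
        self.tokens = {}             # tokens received so far
        self.action = action         # the code block body

    def offer(self, token, value):
        """Deliver a token; tokens the node does not depend on are ignored."""
        if token in self.required:
            self.tokens[token] = value

    def ready(self):
        """Firing rule: every required token must be present."""
        return self.required.issubset(self.tokens)

    def fire(self):
        """Run the code block if the firing rule holds, consuming its tokens."""
        if not self.ready():
            return None
        args = dict(self.tokens)
        self.tokens.clear()
        return self.action(args)


node = Node(
    "capture@camera1",
    inputs=["request", "sync:c1"],
    action=lambda tok: "captured for " + tok["request"],
)
node.offer("request", "truck-42")
first = node.fire()            # the synchronization token has not arrived yet
node.offer("sync:c1", True)    # synchronization to clock c1 is established
second = node.fire()
print(first, second)
```

The first attempt does nothing because only the data dependency is satisfied; the node runs once the synchronization token is also present. Tokens are consumed on firing, so a later capture would need a fresh request and a fresh confirmation of synchronization.
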
Synchronization and simultaneity dependencies on reference clocks can be explicitly represented. Synchronization, when established, will yield tokens that become part of the firing rules for the respective nodes, while simultaneity has to be ensured by programming and analysis. Ensembles are depicted in green. As an example, a sub-domain involving a network ensemble, a sensor ensemble, and an actuator ensemble is highlighted in red. System operations such as code block placement and synchronization are handled by the Run-Time Manager (RTM). Feedback from network elements to the RTM facilitates improved synchronization (dashed arrow).

At the language level, timing semantics for commonly used functionality can be categorized as follows [3]:

Frequency-based sensing/actuating. Periodic actions are very common in IoT applications and are characterized as "take an action every x nanoseconds, microseconds, or milliseconds." Certain frequencies of sensing or actuating are required to achieve the desired Quality of Control (QoC).

Syntonization and Synchronization. Certain levels of synchronization are required for many applications. However, high-precision time synchronization in a large-scale system generates network traffic, which in turn makes network delay less predictable. This problem can be addressed by defining variable synchronization levels for ensembles.

Simultaneous sensing/actuating. Performing two or more concurrent actions is very common in Multi-Agent Systems, which are widely used for purposes such as Distributed Learning and Problem Solving, Decentralized Control, Formation Control, and the like. Hence, the application must be able to push code blocks into ensembles so that the desired actions are taken simultaneously.

Latency-based sensing/actuating.
In time-sensitive applications, sensor information and results computed from the sensors are valid for only a specific temporal interval, necessitating bounded or fixed latency constraints on communication and computation. Timeliness, or the temporal limits within which the application must communicate information or execute an action, can be described through latency-based specifications.

At the network level, in assimilating information across a smart city, the nanosecond scale of computation is dwarfed by the tens to hundreds of milliseconds needed to traverse network connections across a city. The worst-case round-trip time for a cyber-physical control loop (sensing, computing, and acting) can easily exceed 1000 milliseconds, making classical cloud-based cyber-physical systems useless for cases requiring response times in the deep sub-second regime. One important and promising approach to reducing CPS latency when mobile networks are involved is to move the computation into the network itself, exposing the trade-off between the network latency, the amount of computation on end-devices, and the network bandwidth requirements. Cloudlets [6] and fog computing [7] have motivated research in this area.

While programmer specification of timing requirements is necessary, it is not sufficient. The nodes and network impose constraints. As such, we seek to extract information about both latency and latency variability in real time from the FCPS and to feed this information in a usable form back to the programmer. We argue that most realistic networks exhibit time-varying behavior and that knowledge of the current state of the network can be used in dynamically optimizing how a distributed program works.

IV. ACKNOWLEDGEMENT

This material is based upon work supported by the National Science Foundation under grants no. CPS 1646235 and CPS 1645578.

REFERENCES

[1] M. Weiss, J. Eidson, C. Barry, D. Broman, L. Goldin, B. Iannucci, and K. Stanton, Time-aware applications, computers, and communication systems (TAACCS). NIST, 2015.
[2] A. Shrivastava et al., "Time in cyber-physical systems," in 2016 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). IEEE, 2016, pp. 1-10.
[3] M. Mehrabian et al., "Timestamp temporal logic (TTL) for testing the timing of cyber-physical systems," ACM Transactions on Embedded Computing Systems (TECS), vol. 16, no. 5s, p. 169, 2017.
[4] R. Wilhelm et al., "The Worst-Case Execution-Time Problem: Overview of Methods and Survey of Tools," ACM Trans. Embed. Comput. Syst., vol. 7, no. 3, pp. 36:1-36:53, May 2008.
[5] B. Iannucci and A. Rowe, "Crowdsourced Smart Cities," in Intelligent Transportation Society of America (ITS) World Congress, Montréal, 2017 (to appear).
[6] M. Satyanarayanan, Z. Chen, K. Ha, W. Hu, W. Richter, and P. Pillai, "Cloudlets: at the leading edge of mobile-cloud convergence," in Mobile Computing, Applications and Services (MobiCASE), 2014 6th International Conference on. IEEE, 2014, pp. 1-9.
[7] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, "Fog computing and its role in the internet of things," in Proceedings of the first edition of the MCC workshop on Mobile cloud computing. ACM, 2012, pp. 13-16.
