Learning a Physical Activity Classifier for a Low-power Embedded Wrist-located Device

Ricard Delgado-Gonzalo (1), Philippe Renevey (1), Adrian Tarniceriu (2), Jakub Parak (3), and Mattia Bertschi (1)

Abstract: This article presents and evaluates a novel algorithm for learning a physical activity classifier for a low-power embedded wrist-located device. The overall system is designed for real-time execution and is implemented on the commercial low-power Systems-on-Chip nRF51 and nRF52. Results were obtained using a database composed of 140 users containing more than 340 hours of labeled raw acceleration data. The final precision achieved for the most important classes (Rest, Walk, and Run) was 96%, 94%, and 99%, respectively, and the classifier generalizes to compound activities such as XC skiing or Housework. We conclude with a benchmarking of the system in terms of memory footprint and power consumption.

I. INTRODUCTION

Consumer wearable devices are a growing market for monitoring physical activity, sleep, and energy expenditure. These devices are the main promoter of the Quantified Self movement, engaging those who wish to track their own personal data with the aim of improving healthy behaviors. These devices provide feedback to the wearer through device-specific interfaces (e.g., smartphones, web services). Some solutions even provide the means to compare against one's peers or a broader community of users, both of which are useful for increasing overall physical activity through peer pressure. Originally used by sports and fitness enthusiasts, fitness trackers are now becoming popular as an everyday exercise measurer and motivator for the general public. Step counters can give encouragement to compete with oneself in getting fit and losing weight [1], [2].

In the context of human kinetics, wearable devices aim at detecting, classifying, or profiling the kinetic information of the wearer gathered most often through inertial sensors [3].
These three capabilities are not always present in all systems and are not mutually exclusive either. Among the different inertial sensors, accelerometers have been shown to be the best suited for a robust recognition of physical activities in wearable systems [4]. Their success and their mainstream usage can be attributed to the fact that they represent a good balance between kinetic information acquired, power consumption, cost, and miniaturization. In laboratory settings, the most prevalent everyday activities (resting, walking, and running) have been successfully recognized with high precision and recall [5], [6]. However, one has to be careful when extrapolating these results to out-of-lab monitoring due to the high variability of real-life activities. Direct applicability of the performance results has been challenged in several studies [7]. For example, in [8] the recognition accuracy of nine patterns decreased from 95.8% to 66.7% as the recordings were shifted outside the laboratory.

(1) R. Delgado-Gonzalo, Ph. Renevey, and M. Bertschi are with the Swiss Center for Electronics and Microtechnology (CSEM), Neuchâtel, Switzerland; e-mail: ricard.delgado@csem.ch.
(2) A. Tarniceriu is with PulseOn SA, Neuchâtel, Switzerland; e-mail: adrian.tarniceriu@pulseon.com.
(3) J. Parak is with PulseOn Oy, Espoo, Finland; e-mail: jakub.parak@pulseon.com.

In the literature, many authors take a principled approach to algorithm design. That is, the activity classification is derived from a carefully hand-picked list of features extracted from the inertial system. These features have a physical meaning that can be exploited by experts in the field to construct a classifier based on well-understood physics. This approach provides good results in protocoled scenarios and in-lab conditions.
However, the algorithms based on this approach do not always enjoy the level of generality required for consumer products since they do not take into account the variability in a large user base.

An alternative approach that has been gaining steam during the last decade is based on machine learning and data mining. The entry of smartwatches and fitness bands into the consumer market has provided companies with large quantities of data. This data is being used in bulk to iteratively train and improve machine learning algorithms. These algorithms need little or no human intervention and are capable of providing reasonable results in out-of-the-lab conditions. The price that is paid for this automation is the loss of insight into the meaningfulness of the parameters that the algorithm is learning. Moreover, the classes that the system learns may not always correspond to actual different activities due to statistical aberrations derived from the curse of dimensionality.

In the present study, we describe and evaluate a hybrid approach that is data-driven in nature but contains key trainable submodules. The overall system is designed for real-time execution and is implemented on commercial low-power SoCs (nRF51 and nRF52). In the following section, we describe the sensors and protocols that were used to acquire the necessary data to train the physical activity classifier. Then, we proceed with a description of the structure and implementation of the algorithm. Finally, we conclude with a benchmarking of the system in terms of accuracy, memory footprint, and power consumption on both platforms.

II. MATERIALS

A. Sensors

A smart wristband integrating a three-axial accelerometer and enough memory to store a full night of raw data was developed by PulseOn (footnote 1) and used to acquire raw inertial signals. The sensor integrated in the device is an ST LIS3DSH and it provides three axial accelerations in the ±8 g range.
Signals are digitized over 12 bits at a sampling frequency of 25 Hz.

B. Data acquisition

The acceleration forces from the wrist-located sensors were recorded on 140 individuals (76 male, 64 female) in 18 recording campaigns. The data collection was conducted between 2014 and 2017 in Tampere (Finland), Espoo (Finland), and Neuchâtel (Switzerland) (footnote 2) and it was structured in several databases depending on the type of performed activity/protocol:

• ADLY: Free office work
• BOUT: Mountain-biking at a variable cadence
• FVO2: Running outdoors between 30 and 60 minutes at irregular pace
• LVO2: In-lab timed protocol (40 min) containing sitting, and walking and running on a treadmill at increasing speeds
• MBDY: Random daily activities such as office work, driving, having lunch, etc.
• MBOT: Walking, running, and cycling outdoors
• MDLY: Random daily activities
• MFOT: Walking, running, cycling, and skiing at comfortable intensity
• MINT: Random gym activities followed by exercise on a treadmill and cycle ergometer
• MLAB: In-lab timed protocol (40 min) containing sitting, walking and running on a treadmill, and cycle ergometry
• MOUT: Running outdoors between 30 and 60 minutes and 4 sets of outdoor cycling
• MOUTXC: Cross-country skiing
• MSLP: Overnight sleep sessions
• MVAL: In-lab timed protocol (40 min) containing sitting, walking and running on a treadmill, and cycle ergometry
• SDLY: Random daily activities including office work and housekeeping
• SLAB: In-lab timed protocol (40 min) containing sitting, walking and running on a treadmill, cycle ergometry, and push-ups
• SPRT: Indoor walking, running, cycle ergometry
• SPTEST: Indoor walking, running, cycle ergometry, and office work

A comprehensive summary of the content of each database is shown in Table I. A total number of 418 recordings spanning more than 440 hours of raw data was gathered.
It is important to notice that some of the 140 test subjects participated in more than one database.

Footnote 1: http://pulseon.com/
Footnote 2: The experimental procedures described in this paper complied with the principles of the Helsinki Declaration of 1975, as revised in 2000. All subjects gave informed consent to participate and they had a right to withdraw from the study at any time. Their information was anonymized prior to the analysis.

TABLE I. SUMMARY OF THE CONTENT OF EACH DATABASE.

| Set ID | Recordings (#) | Duration (h) | Subjects (#male/#female) | Age (yrs) |
|--------|----------------|--------------|--------------------------|-----------|
| MOUT | 33 | 21.32 | 13m/2f | 33.5 ± 10.3 |
| MVAL | 21 | 13.49 | 11m/10f | 28.3 ± 5.69 |
| MLAB | 24 | 16 | 17m/7f | 24.91 ± 3.09 |
| FVO2 | 24 | 18 | 13m/11f | 36.1 ± 8.0 |
| LVO2 | 24 | 16 | 13m/11f | 36.1 ± 8.0 |
| MSLP | 15 | 79.46 | 13m/2f | 35.9 ± 10.3 |
| MDLY | 12 | 17 | 3m/0f | 32.3 ± 9.1 |
| MINT | 9 | 5.99 | 5m/2f | 35.57 ± 11.13 |
| SDLY | 59 | 67.7 | 15m/13f | 27.21 ± 6.77 |
| SLAB | 47 | 42.9 | 21m/20f | 26.40 ± 3.27 |
| SPRT | 8 | 3.52 | 4m/1f | 32.20 ± 6.14 |
| BOUT | 11 | 6.97 | 6m/1f | 28.66 ± 6.34 |
| ADLY | 18 | 21.9 | 3m/0f | 29.66 ± 2.08 |
| SPTEST | 40 | 18 | 5m/5f | 46.20 ± 11.81 |
| MBDY | 13 | 25.15 | 5m/1f | 30.60 ± 9.73 |
| MBOT | 16 | 13.6 | 5m/0f | 41.25 ± 15.80 |
| MFOT | 15 | 8.7 | 4m/3f | 26.50 ± 4.23 |
| MOUTXC | 29 | 44.87 | 6m/3f | 39.33 ± 14.35 |
| Total | 418 | 440.57 | 76m/64f | 29.4 ± 8.58 |

Several experts annotated the data, choosing one of the following labels: Rest, Walk, Run, Bike, Office, XC skiing, Gym, Housework. After consolidation, 340.5 hours from the total of 440.57 hours were annotated. The remaining intervals corresponded to unclear activities or fuzzy transition zones.

C. Algorithm structure

The algorithm takes as input the raw accelerometer signals at 25 Hz and outputs the most likely activity among Rest, Walk, Run, Bike, or Other. The structure is iterative and operates on a sample-by-sample basis, meaning that every new sample from the accelerometer produces a new estimate of the most likely ongoing activity.

Internally, the algorithm is composed of four clearly separated parts (see Figure 1).
In the first part, several features including signal power, rhythmicity, and frequency stability are extracted from the accelerometer signals. These features are used as predictors in a binary classification tree of depth seven in the second part. Each node of the classification tree contains a different likelihood for each activity. Then, a filter bank of first-order autoregressive filters (one filter for each activity) is applied independently to each activity. These filters enforce temporal consistency and their output operates as the set of a-posteriori probabilities for each activity. Finally, the activity with the highest probability is selected as output provided that the probability is above a certain threshold; otherwise, Other is selected as output.

Fig. 1. Schematic structure of the embedded algorithm. Blocks: Acc (25 Hz), Feature extraction, Learned tree, A-posteriori probability estimator, MAP estimator, Activity.

D. Learning classification graph

The proposed physical activity classifier was trained with a subset of the recordings described in Section II-B. More precisely, 31 recordings were randomly selected, and contained multiple activities: outdoor walking and running, road-biking, mountain-biking, indoor cycling, treadmill walking and running, and sleeping. The clear parts of the recordings were manually annotated by several experts following record logs, resulting in a total of 1386411 labelled samples (> 18 hours). Among them, 29.1% were labeled as Rest, 30% were labeled as Walk, 17.5% were labeled as Run, 23% were labeled as Bike, and 0.5% were labeled as Other. The corresponding extracted features were used as predictors to train the binary classification tree using Gini's diversity index as splitting criterion.

E. Embedded implementation

The classification algorithm was implemented for two different embedded platforms: Nordic Semiconductor's nRF52832 and the nRF51822 from the same manufacturer.
Each implementation corresponds to a different level of abstraction in present-day commercial embedded systems and requires dedicated instructions in order to take advantage of each platform's strengths and deal with its limitations.

The Nordic Semiconductor nRF52832 SoC incorporates an ARM® Cortex®-M4F microprocessor. This SoC performs 32-bit integer arithmetic and includes a single-precision, IEEE 754-compliant floating point unit (FPU). On the other hand, the Nordic Semiconductor nRF51822 SoC incorporates an ARM® Cortex®-M0 microprocessor. This SoC performs a restricted 32-bit integer arithmetic (excluding 32-bit divisions) and does not include an FPU.

The implementation on the nRF52832 was performed in C using the CMSIS library (footnote 3) for mathematical operations and following a restricted set of the MISRA-C:2012 guidelines. Likewise, the implementation on the nRF51822 was performed in C using fixed-point arithmetic, avoiding 32-bit divisions when possible, and following a restricted set of the MISRA-C:2012 guidelines.

Footnote 3: https://developer.arm.com/embedded/cmsis

III. RESULTS AND DISCUSSION

The behavior of the algorithm is best illustrated through an example. In Figure 2, we show the process from the raw acceleration signals to a real-time estimation of the physical activity, going through the instantaneous likelihood of each activity class. This particular example is extracted from the database LVO2 defined in Section II-B. The raw data is shown in the topmost subfigure, where the three stages of the protocol can clearly be seen: resting, walking, and running at an increasing speed. Then, the a-posteriori probabilities are shown for each class. It is worth noting that the a-posteriori probabilities ramp up or down following an exponential curve defined by the first-order autoregressive filters discussed in Section II-C. Finally, in the bottommost subfigure, the final class estimate is shown.
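
The exponential ramps described above are what a first-order recursive filter produces when its input switches from one likelihood level to another. The following Python sketch illustrates the filter bank and the thresholded selection of Section II-C under stated assumptions: the smoothing coefficient `alpha`, the `threshold`, and the names `PosteriorSmoother` and `update` are hypothetical, since the paper does not give the filter coefficients, the threshold value, or how the likelihoods are normalized.

```python
from dataclasses import dataclass, field

ACTIVITIES = ("Rest", "Walk", "Run", "Bike")


@dataclass
class PosteriorSmoother:
    """Bank of first-order recursive filters, one per activity.

    alpha and threshold are illustrative values, not taken from the paper.
    """

    alpha: float = 0.02      # assumed smoothing coefficient per sample
    threshold: float = 0.5   # assumed acceptance threshold
    state: dict = field(default_factory=lambda: {a: 0.0 for a in ACTIVITIES})

    def update(self, likelihoods):
        """Consume one set of per-activity likelihoods and return a label."""
        for activity in ACTIVITIES:
            previous = self.state[activity]
            # y[n] = (1 - alpha) * y[n-1] + alpha * x[n]
            self.state[activity] = (
                (1.0 - self.alpha) * previous + self.alpha * likelihoods[activity]
            )
        # Pick the activity with the largest smoothed value, or fall back
        # to "Other" when that value does not exceed the threshold.
        best = max(ACTIVITIES, key=lambda a: self.state[a])
        return best if self.state[best] > self.threshold else "Other"


# Feed 10 s (250 samples at 25 Hz) of a constant, idealized Walk likelihood.
smoother = PosteriorSmoother()
walk_only = {"Rest": 0.0, "Walk": 1.0, "Run": 0.0, "Bike": 0.0}
labels = [smoother.update(walk_only) for _ in range(250)]
```

With these assumed values, the output stays at Other for the first 34 samples and switches to Walk at the 35th (1.4 s at 25 Hz). A larger `alpha` would react faster but would also let short misclassifications from the tree reach the output.
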
In this example, most of the segments have one clearly dominant class, except the initial moments in the transition between rest and slow walking, where several activities compete to be the most likely.

Fig. 2. Example of the intermediate steps of the algorithm through a dataset from LVO2. (top) raw acceleration signals, (mid) instantaneous class probabilities, (bottom) class estimate.

A. Accuracy

In order to evaluate the accuracy of the system, a total of 340.5 hours were labeled from the databases. In Table II, we show the normalized confusion matrix for the labeled data. The columns represent the estimated class from the classifier and the rows represent the actual labelled classes. In bold, we have marked the classes for which there is a one-to-one correspondence between the estimated and actual classes.

TABLE II. CLASSIFICATION ACCURACY OF THE LEARNED ACTIVITY CLASSIFIER.

| Actual activity | Rest (%) | Other (%) | Walk (%) | Run (%) | Bike (%) | Duration (hours) |
|-----------------|----------|-----------|----------|---------|----------|------------------|
| Resting | 96.3 | 2.80 | 0.90 | 0.00 | 0.16 | 86.3 |
| Walking | 1.51 | 2.68 | 94.66 | 0.61 | 0.53 | 47.9 |
| Running | 0.05 | 0.15 | 0.91 | 98.88 | 0.00 | 61.8 |
| Biking | 27.63 | 25.31 | 24.74 | 0.02 | 22.29 | 27.4 |
| Office working | 81.18 | 13.38 | 4.29 | 0.18 | 0.96 | 24.3 |
| XC skiing | 0.60 | 24.56 | 53.69 | 18.74 | 2.40 | 33.3 |
| Gym | 8.25 | 13.08 | 53.57 | 16.6 | 8.47 | 2.90 |
| Housework | 37.82 | 29.66 | 25.24 | 0.05 | 7.20 | 56.5 |

The three main classes, Rest, Walk, and Run, have high classification accuracy and recall, ranging from 94.7% to 98.9% correctly classified samples. However, biking is equally distributed between Rest, Other, Walk, and Bike. This stems from the fact that there exist several styles of biking that the classifier did not take into account. For instance, road-biking without pedaling is usually classified as Rest, and mountain-biking where the user is not seated is usually classified as Walk. This indicates that biking is a multimodal exercise and it would be worthwhile splitting it into more coherent sub-styles.
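
Table II is a row-normalized confusion matrix: each row is divided by the number of samples carrying that actual label, so every row sums to roughly 100% and, for a class with a one-to-one correspondence, the matching entry is that class's recall. The sketch below shows this normalization in Python on hypothetical counts; the function name `normalize_rows` and the example numbers are illustrative and are not taken from the study.

```python
def normalize_rows(counts):
    """Convert raw confusion counts into row-wise percentages.

    counts[i][j] is the number of samples whose actual class is i and
    whose estimated class is j.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        # A row without samples is reported as all zeros.
        normalized.append([100.0 * c / total if total else 0.0 for c in row])
    return normalized


# Hypothetical counts for two classes (not data from the paper).
example_counts = [
    [90, 10],  # actual class 0: 90 estimated as class 0, 10 as class 1
    [20, 60],  # actual class 1: 20 estimated as class 0, 60 as class 1
]
example_percent = normalize_rows(example_counts)
```

Because the normalization is per row, classes with very different amounts of data (2.90 hours of Gym versus 86.3 hours of Resting in Table II) appear on the same scale; the Duration column indicates how much data stands behind each row.
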
On the other hand, it is interesting to see how the classifier generalizes to activities it has not been trained for. Office working mainly consists of Rest and a bit of Other (mainly while typing on a keyboard); XC skiing is a combination of Walk and Other; Gym is mainly Walk followed by Run and Other; and finally, Housework is equally distributed between Rest, Other, and Walk.

B. Computational load

For the purpose of measuring the computational load of the different implementations of the algorithm, a standard dataset was defined. This dataset contained 1 minute of raw accelerometer data and was fed to the algorithm cyclically 1000 times. The execution time was then averaged across all iterations in order to obtain the average execution time per acceleration sample. The test dataset was composed of 58% of resting, 12.5% of walking, 4.2% of running, and 25.3% of other. For the nRF52832 and nRF51822 SoCs, the whole library was compiled with O3 optimizations with gcc (GNU toolchain for ARM Cortex-M and Cortex-R processors) version 6.0 using Nordic's SDK 12.1. The execution time was then converted to average drained current by using the data contained in the respective SoC's datasheets. A summary of the computational complexity and current consumption is detailed in Table III.

TABLE III. COMPUTATIONAL LOAD ON EACH PLATFORM.

| Platform | Execution time (ms) | Current (µA) | Complexity |
|----------|---------------------|--------------|------------|
| nRF52832 | 0.067 | 15.3 | 107 KFLOPS |
| nRF51822 | 0.458 | 50.4 | 183 KIPS |

C. Memory footprint

The memory requirements for the nRF52832 and nRF51822 were measured by compiling the whole library with O3 optimizations with gcc (GNU toolchain for ARM Cortex-M and Cortex-R processors) version 6.0 using Nordic's SDK 12.1. The listing (MAP) file produced by the linker was used to generate a complete list of all variables, constants, and functions used. A summary of the memory requirements is detailed in Table IV.
TABLE IV. MEMORY FOOTPRINT OF EACH IMPLEMENTATION OF THE ALGORITHM.

| Platform | RAM (Kbytes) | Flash (Kbytes) |
|----------|--------------|----------------|
| nRF52832 | 2 | 7.9 |
| nRF51822 | 2 | 8.6 |

These numbers reflect the small memory footprint of the algorithms. An interesting phenomenon that can be observed in Table IV is that the size of the code is smaller on the nRF52832 than on the nRF51822. This is because, when an FPU is present, the algorithm can be written in a more compact way than in fixed-point arithmetic, generating fewer low-level instructions.

IV. CONCLUSION

Based on the presented results, we conclude that the proposed approach for learning an activity classifier is capable of generating an algorithm that can be integrated in a real-time embedded system while keeping a high precision and recall for the most common activities (Rest, Walk, and Run). The algorithm generalizes properly to other sports such as XC skiing or Gym, and daily activities such as Office working or Housework. The power consumption and the memory needed represent only a minimal fraction of the overall firmware, and open the door to a future ultra-low-power ASIC implementation.

REFERENCES

[1] J. VanWormer, “Pedometers and brief e-counseling: Increasing physical activity for overweight adults,” Journal of Applied Behavior Analysis, vol. 37, no. 3, pp. 421–425, September 2004.
[2] G. Le Masurier, C. Sidman, and C. Corbin, “Accumulating 10,000 steps: Does this meet current physical activity guidelines?” Res. Q. Exerc. Sport, vol. 74, no. 4, pp. 389–394, December 2003.
[3] R. Delgado-Gonzalo, P. Renevey, A. Lemkaddem, M. Lemay, J. Solà, I. Korhonen, and M. Bertschi, Seamless Healthcare Monitoring - Advancements in Wearable, Attachable, and Invisible Devices. Springer, 2017, ch. Physical Activity.
[4] M. Mathie, A. Coster, N. Lovell, and B. Celler, “Accelerometry: Providing an integrated, practical method for long-term, ambulatory monitoring of human movement,” Physiol. Meas., vol. 25, no. 2, pp. R1–R20, February 2004.
[5] P. Pärkkä, M. Ermes, P. Korpipää, J. Mäntyjärvi, J. Peltola, and I. Korhonen, “Activity classification using realistic data from wearable sensors,” IEEE Trans. Inf. Technol. Biomed., vol. 10, no. 1, pp. 119–128, January 2006.
[6] R. Delgado-Gonzalo, P. Celka, P. Renevey, S. Dasen, J. Solà, M. Bertschi, and M. Lemay, “Physical activity profiling: Activity-specific step counting and energy expenditure models using 3D wrist acceleration,” in Proceedings of the EMBC'15, Milano, Italy, August 25-29 2015.
[7] M. Ermes, J. Pärkkä, and I. Mäntyjärvi, “Detection of daily activities and sports with wearable sensors in controlled and uncontrolled conditions,” IEEE Trans. Inf. Technol. Biomed., vol. 12, no. 1, pp. 20–26, January 2008.
[8] R. Delgado-Gonzalo, A. Lemkaddem, P. Renevey, E. Calvo, M. Lemay, K. Cox, D. Ashby, J. Willardson, and M. Bertschi, “Real-time monitoring of swimming performance,” in Proceedings of the EMBC'16, Orlando, Florida, USA, August 16-20 2016.
