
Adapting to Node Failure In Sensor Network Query Processing

Alan B. Stokes, Alvaro A. A. Fernandes, Norman W. Paton
School of Computer Science, University of Manchester, Manchester M13 9PL, United Kingdom

Abstract. The typical nodes used in mote-level wireless sensor networks (WSNs) are often brittle and severely resource-constrained. In particular, nodes are often battery-powered, thereby making energy depletion a significant risk. When changes to the connectivity graph occur as a result of node failure, the overall computation may collapse unless it is capable of adapting to the new WSN state. Sensor network query processors (SNQPs) construe a WSN as a distributed, continuous query platform where the streams of sensed values constitute the logical extents of interest. Crucially, in the context of this paper, they must make assumptions about the connectivity graph of the WSN at compile time that are likely not to hold for the lifetime of the compiled query evaluation plans (QEPs) the SNQPs generate. This paper addresses the problem of ensuring that a QEP continues to execute even if some nodes fail. The goal is to extend the lifetime of the QEP, i.e., the period during which it produces results, beyond the point where node failures start to occur. We contribute descriptions of two different approaches that have been implemented in an existing SNQP and present experimental results indicating that each significantly increases the overall lifetime of a query compared with a non-adaptive approach.

Keywords: Sensor Network Query Processors, Wireless Sensor Networks, Resilience

1 Introduction

Wireless sensor networks (WSNs) are useful in data collection, event detection and entity tracking applications, among others. In particular, mote-level WSNs are sufficiently inexpensive that one can envisage deploying them to sense at fine granularities both over space and over time.
With the low cost, however, come severe resource constraints in terms of energy stock, communication range, computational and storage capabilities, etc. Our focus here is on WSNs comprising static motes of this kind (e.g., [1]).

If one views the WSN as simply an instrument for data collection, one might task the relevant subset of nodes to sense the physical world and send the sensed values, using multi-hop communication paths, towards a base station where all the processing takes place. However, sending all data in raw form to the base station causes more bytes to be transmitted than would be the case if the nodes along the route to the base station were tasked with some of the processing [10]. Since the energy cost of processing data is typically an order of magnitude smaller than the energy cost of transmitting the same data [6], it is more energy-efficient to do as much processing as possible inside the WSN, as this is likely to reduce the number of bytes that are transmitted to the base station.

One approach to in-WSN processing construes the WSN as a distributed database, where the processing task injected into nodes for execution is the evaluation of a query evaluation plan (QEP). In this approach, users specify their data requirements in the form of declarative queries, which the system, called a sensor network query processor (SNQP), compiles into optimized QEPs for injection into the WSN. Through periodic evaluation, a stream of results is returned to the users via the base station.

Many SNQPs have been proposed in the literature, e.g., SNEE [4], TinyDB [9], and AnduIN [7]. These SNQPs often differ in terms, among others, of how much of the required query functionality can be injected into the WSN, how much use they make of distributed query processing techniques (e.g., fragment partitioning, buffering tuples for block transmission, etc.), and how much compile-time knowledge of the WSN state they require in order to produce a QEP.
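The energy argument for in-WSN processing above can be made concrete with a toy calculation. The sketch below is illustrative only: the hop count, tuple count, and tuple size are invented, and it assumes a simple aggregation that reduces each node's inputs to one output tuple, not any particular SNQP's cost model.

```python
# Hypothetical back-of-envelope comparison of radio traffic with and
# without in-network aggregation, under assumed (not measured) costs.

def bytes_transmitted(hops, tuples, tuple_size, aggregate_in_network):
    """Total bytes sent over all hops of one multi-hop path to the sink."""
    if aggregate_in_network:
        # Each node reduces its inputs to a single aggregate tuple
        # before forwarding, so only one tuple crosses each hop.
        return hops * tuple_size
    # Raw collection: every acquired tuple is relayed over every hop.
    return hops * tuples * tuple_size

raw = bytes_transmitted(hops=4, tuples=10, tuple_size=16, aggregate_in_network=False)
agg = bytes_transmitted(hops=4, tuples=10, tuple_size=16, aggregate_in_network=True)
print(raw, agg)  # 640 64
```

Even in this crude model, raw collection transmits an order of magnitude more bytes, which is the saving that in-WSN processing aims to capture.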
Thus, AnduIN does not inject joins for in-network execution, only QEP leaves, i.e., sensing tasks. AnduIN uses a TCP/IP protocol stack and therefore has no need to know the state of the WSN connectivity graph at compile time. The benefit ensuing from this approach comes at the price of potentially greater energy expenditure (since TCP and IP were not designed with the goal of preserving energy) and of reduced capability to inject functionality into the network (since part of the very scarce program memory has to be assigned to the protocol stack). In contrast, TinyDB is capable of performing limited forms of joins inside the WSN but pushes the entire QEP to every participating node. Again, this is profligate with respect to program memory. TinyDB infers the current connectivity graph from the dissemination of the QEP into the WSN. Finally, SNEE, which we focus on in this paper, pushes very expressive QEPs into the WSN whilst still partitioning the latter into fragments that are as small as possible for each node. However, SNEE neither uses a generic protocol stack nor can it compile the QEP without knowledge of the current connectivity graph.

SNEE does more in-WSN processing than the other SNQPs mentioned above. It generates QEPs that deliver good energy efficiency [4] whilst scheduling for node execution QEP fragment instances that use less memory (partly by not using, and hence not loading, generic protocol stacks) [4] than the other SNQPs mentioned above. To generate QEPs where medium access, routing, and transport are query-specific, the SNEE compiler takes as input (among other metadata) the current connectivity graph. This implies a further, and stronger, assumption, viz., that if the connectivity graph changes (e.g., a node fails) during the lifetime of QEP p, then p may not be optimal for the new topology (and, indeed, p may even be unable to go on running). In other words, maximizing QEP lifetime is dependent on resilience to failure.
A SNEE QEP often has its lifetime bounded by the time-to-failure of participating nodes. In practice, not only is node failure assumed to be a common occurrence, the energy stock of participating motes is guaranteed to diminish over time, and depletion eventually causes the motes to become non-functional.

SNEE QEPs are therefore particularly brittle: if a participating node fails, poor performance, or even a crash, could ensue. One aspect of poor performance is the lack of adaptation to tuple loss when the corresponding extent draws from a failed node. Such failures lead to partial results for the query. It is, therefore, desirable that, if possible, the QEP is adapted in response to node failure. Another possibility is that the failed node causes the communication graph used by the QEP to partition in such a way that, although all sensed values are flowing out of the leaves, they cannot be used as they fail to reach some downstream operators, i.e., the energy expenditure of the leaves is wasted.

Adaptations aim to minimize information loss and foster compliance with quality of service (QoS) expectations such as maximum delivery rate and constant acquisition rate. The purpose of adaptations, in the case of this paper, is to maximize the lifetime of the QEP. Since lifetime is influenced by the rate of depletion of energy stocks, and since any adaptation will cause some such depletion (i.e., carries an energy overhead cost), adaptations must take into account the time taken to adapt (during which data will cease to flow) and the energy spent in carrying out the adaptation. Our hypothesis is that the benefit of adapting with a view to significantly increasing the QEP lifetime (and, therefore, the amount of data produced) outweighs the cost incurred in adapting.

We compare two strategies that, at runtime, adapt the QEP in different ways to increase resilience to failure.
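The cost-benefit reasoning above can be sketched as a simple comparison of lifetimes with and without adapting. This is an illustrative model only, not the paper's actual cost model: the function name, the constant depletion rates, and the example numbers are all invented for exposition.

```python
# Illustrative sketch (not SNEE's cost model): an adaptation pays off
# when the energy it consumes is outweighed by the slower depletion
# rate of the adapted QEP, extending the period in which results flow.

def adaptation_worthwhile(energy_stock_j, depletion_rate_w,
                          adaptation_cost_j, adapted_depletion_rate_w):
    """Return True if adapting yields a longer remaining lifetime."""
    lifetime_without = energy_stock_j / depletion_rate_w      # seconds
    remaining = energy_stock_j - adaptation_cost_j            # energy left after adapting
    if remaining <= 0:
        return False                                          # adapting exhausts the stock
    lifetime_with = remaining / adapted_depletion_rate_w
    return lifetime_with > lifetime_without

# A costly adaptation can still win if the adapted QEP drains energy
# more slowly: 950 J / 0.4 W = 2375 s versus 1000 J / 0.5 W = 2000 s.
print(adaptation_worthwhile(1000.0, 0.5, 50.0, 0.4))  # True
```

The same comparison fails when the adapted plan is no more frugal than the original, which is why the time and energy overheads of each strategy matter.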
The first strategy acts as a baseline: it recomputes the entire QEP for the new deployment state and disseminates the recomputed QEP. This acts like a reset and has high overhead because both the dissemination of a QEP and its writing onto program memory are costly in terms of time and energy expenditure. In contrast, the second strategy identifies adaptations that require the minimal amount of changes to repair the QEP. The results show that the adaptation costs incurred by both strategies can lead to significant increases in the lifetime of a QEP.

The rest of the paper is as follows. Sec. 2 briefly describes related work. Sec. 3 describes the SNEE SNQP and how the time and energy costs of a QEP are modelled. Sec. 4 describes each strategy. Sec. 5 describes how we experimentally evaluated each one. Sec. 6 draws conclusions.

2 Related Work

Broadly speaking, current SNQPs are not very tolerant of node failure. In TinyDB, the fact that routing trees [9] are constructed during the QEP dissemination process provides some amount of inter-query fault tolerance, as failed nodes do not take part in disseminating the next QEP (which could be a recompilation of the same query) and hence will be disqualified from participating in its evaluation. Also, each node in TinyDB evaluates the entire QEP (i.e., TinyDB makes no attempt to partition the plan into fragments), and, as a result, ignoring a failed node is a sound strategy. Thus, whilst TinyDB is not, strictly speaking, adaptive, it is, to a certain degree, resilient to some forms of node failure.
However, tuple transmission is not globally scheduled (as it is in SNEE), so there is no way of estimating how many tuples might be lost as a result of failed nodes.

The SmartCIS project [8] builds upon TinyDB with a particular goal (among others) of supporting fault-tolerant routing trees via multi-path transmissions. This approach incurs energy overheads in verifying that current paths are correct and in searching for new correct ones.

AnduIN has no specific mechanism for fault tolerance. In contrast with both TinyDB and SNEE, which compile into TinyOS [5], AnduIN compiles into Contiki [3]. The difference is relevant in our context because, unlike TinyOS, Contiki provides a TCP/IP-based communication protocol stack. Thus, AnduIN benefits from the robust routing and transport properties built into TCP/IP. The drawback is that TCP/IP incurs much greater overheads (and takes up a larger memory footprint) than the minimalistic, query-specific protocols used by TinyDB and SNEE. Some of these overheads stem from the need to maintain up-to-date connectivity paths as well as from the need to send acknowledgement packets. As to memory occupancy, TCP/IP implementations will take up space and will also claim more memory for such structures as routing tables. By reducing the memory on the nodes that can be allocated to the QEP, there is a reduction in how much processing can be shipped to the WSN and how much memory can be used for buffering and blocked transmission, both features that are energy-saving. AnduIN does not adapt to failure of acquisition nodes.

Our prior work [11] explored logical overlays, which use redundant nodes (fortuitous or planned) within the network to achieve resilience to node failure using clusters of equivalent nodes.

3 Technical Context

SNEE aims to generate energy-efficient QEPs. The compilation/optimization process takes as input a SNEEql query (as exemplified in Fig.
1), QoS expectations (not shown in the figure) in the form of a desired acquisition rate (i.e., the frequency at which sensing takes place) and a maximum delivery time (i.e., an upper bound on the acceptable amount of time between data being acquired and being reflected in the emitted results), and the following kinds of metadata: (1) the current connectivity graph, which describes the (cost-assigned) communication edges in the WSN; (2) the logical schema for the query, which describes the available logical extents over the sensing modalities in the WSN; (3) the physical schema for the query, which describes which physical nodes contribute data to which logical extent, and which node acts as base station; (4) statistics about nodes (e.g., available memory and energy stocks); and (5) cost-model parameters (e.g., unit costs for sleeping, sensing, processing, and communicating) [2].

The example query takes two streams, one stemming from sensors in a field, the other from sensors in a forest. It joins them on the condition that light levels are higher in the forest than in the field and emits onto the output stream the matching values and the ids of the nodes that generated them. The intuition behind the query is that if light levels in the forest are higher than in the open field, then one might suspect that a forest fire has started.

Fig. 2 shows the SNEE (compilation/optimization) stack. As a distributed query optimizer, it uses a two-phase approach. The single-site phase (Steps 1-3 in Fig. 2) comprises the classical steps needed to compile and optimize a query for centralized execution. The outcome is the physical-algebraic form (PAF) for the query, where each operator has been given its execution order and assigned a concrete algorithm. The multi-site phase (Steps 4-7 in Fig. 2) turns the PAF into a distributed algebraic form (DAF) for the query by making decisions that are specific to in-WSN execution.
These include deciding on a routing tree R, on fragment instance allocation along the routing tree, captured as a DAF D, and on timing the activities in the nodes (switching from QEP fragment evaluation to communication and so on) in the form of an agenda A. A final step converts the triple ⟨R, D, A⟩ into a set of per-node nesC/TinyOS source files, which are then compiled into binary form. This is what we refer to as the executable QEP.

In more detail, Step 4 in Fig. 2 generates a routing tree (RT) for the query as an approximation of a Steiner tree, e.g., the one in Fig. 3(a) for our example query. Each vertex is a sensor node; an edge denotes that the two nodes can communicate; the arrow denotes the direction of communication; double-line circles denote the sink or else nodes that do sensing; single-line nodes do processing or communication or both. Recall that a Steiner tree is a minimum spanning tree (and hence likely to be energy-efficient) that necessarily includes a given set of nodes. In our case, these are the leaves (i.e., the acquisition nodes) and the root (i.e., the base station).

Step 5 in Fig. 2 decides which fragment instances to place for execution in which node. This partitions the PAF into fragment instances and assigns the latter to RT nodes with a view to conserving energy by reducing the number of tuples that need to be transmitted. The resulting DAF for the example query is shown in Fig. 3(b). Dashed boxes define fragment boundaries; the list in curly brackets at the bottom-right corner (below the fragment identifier) denotes how many instances of that fragment there are and in which nodes they run. The fragment containing the deliver operator runs on the sink node, the fragment instances containing the acquisition operators run on the leaf nodes, and the remaining fragment instances are assigned to run on Node 1 because it is, amongst the nodes through which all tuple streams required for the join flow, the hop-count closest to the leaves.
We call such nodes confluence nodes.

Logical Schema: field (id, time, temp, light); forest (id, time, temp, light)
Physical Schema: field: {N6, N9}; forest: {N7}; sink: {N8}
Q: SELECT RSTREAM c.id, c.light, f.id, f.light
   FROM field[NOW] c, forest[NOW] f
   WHERE c.light < f.light

Fig. 1. Example Query, Logical/Physical Schemas
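The placement rule for the join fragment, i.e. choosing the node closest to the leaves among those through which all required tuple streams flow, amounts to finding the deepest common ancestor of the acquisition nodes in the routing tree. The sketch below is our illustration of that idea, not SNEE's actual placement algorithm, and the example topology is invented rather than taken from Fig. 3(a).

```python
# Sketch (not SNEE's algorithm) of selecting a confluence node: the
# routing tree is a child -> parent map rooted at the sink, and the
# confluence node is the deepest node shared by every source's path.

def path_to_root(node, parent):
    """Nodes from `node` up to the sink, ordered leaf-to-root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def confluence_node(sources, parent):
    """Deepest routing-tree node lying on every source-to-sink path."""
    paths = [path_to_root(s, parent) for s in sources]
    common = set(paths[0]).intersection(*map(set, paths[1:]))
    # Walking one path leaf-to-root, the first shared node is the
    # deepest common ancestor, i.e. hop-count closest to the leaves.
    for node in paths[0]:
        if node in common:
            return node

# Invented topology: N6, N9, N7 are acquisition nodes, N8 is the sink.
parent = {"N6": "N1", "N9": "N1", "N7": "N1", "N1": "N8"}
print(confluence_node(["N6", "N9", "N7"], parent))  # N1
```

Placing the join at such a node keeps the streams' combined traffic on the shortest shared portion of the routing tree, which is the tuple-transmission saving the placement step aims for.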