A Knowledge-Based Approach to Interactive Workflow Composition

Jihie Kim, Yolanda Gil, Marc Spraragen
University of Southern California/Information Sciences Institute
Marina del Rey, CA 90292 USA
+1 310 448 8769
{jihie, gil, marcs}@isi.edu

ABSTRACT
Complex applications in many areas, including scientific computations and business-related web services, are created from collections of components to form computational workflows. In many cases end users have requirements and preferences that depend on how the workflow unfolds, and that cannot be specified beforehand. Workflow editors therefore need to be augmented with intelligent assistance in order to help users in several key aspects of the task, namely: 1) keeping track of detailed constraints across selected components and their connections; 2) flexibly accommodating different strategies to construct workflows, e.g., from general knowledge of necessary tasks, from desired results, or from available data; and 3) taking partial or incomplete descriptions of workflows and understanding the steps needed for their completion. We have developed a system called CAT (Composition Analysis Tool) that analyzes workflows and generates error messages and suggestions in order to help users compose complete and consistent workflows. Our approach combines knowledge bases, which have rich representations of components and constraints, with planning techniques that can track the relations and constraints among individual components. We have formalized our approach based on AI planning principles, allowing us to formulate claims about the underlying algorithms as well as the resulting workflows.

Keywords
workflow composition, description logic, interactive approach

INTRODUCTION
Composing computational workflows is essential in many areas, including scientific computing and business applications.
For example, scientists have growing needs to dynamically produce computation workflows where they assemble and link various models that address different aspects of the phenomenon under study [GriPhyN 03, SCEC 03, Geodise 03, MyGrid 03]. In business applications, web services are becoming a promising framework for composing new applications out of existing software components (such as software modules or web services). Some planning approaches have been used in this context [Lansky et al 95, Chien et al 96]. However, automatic planning approaches are not always appropriate to generate these workflows. In some cases, users may not have explicit descriptions of the desired end results or goals in the beginning. Users may only have high-level or partial/incomplete descriptions of the desired outcome or the initial state, and the real goals and initial data input may become clear as they see the features of the components that can be used. Business agreements and past experiences of how the components work may also affect the development of the workflow.

The goal of our work is to develop interactive tools for composing workflows where users select and configure components and the system assists the users by detecting errors and providing intelligent suggestions to solve them. This provides a complementary capability needed for developing computational workflows.

This paper presents an approach to interactive workflow construction.
Novel features of the approach include: 1) combining description logic and planning frameworks; 2) interactive workflow construction through planning techniques; 3) properties for verification of manually created workflows based on planning techniques; and 4) an algorithm that assists users by enforcing those properties and suggesting possible next steps.

Using this approach, we have developed the CAT system to analyze a partial workflow composed by the user, notify the user of issues to be resolved in the current workflow, and suggest to the user what actions could be taken next. In this paper we focus on the planning techniques that are used in our framework. The details of our system's interfaces and user interactions are described in [Kim et al 04].

The paper begins by describing our motivations and goals based on a scientific application (earthquake simulation) which led us to develop CAT. We outline the representation that we have developed to describe workflow components. We define desirable properties of workflows, and the composition actions that the user can employ to refine the workflow. We then present the ErrorScan algorithm that analyzes a partial workflow and generates suggestions that will lead the user to compose an error-free workflow. Finally, we present CAT's contributions in the context of related work.

An important aspect of our approach is verifying workflows by detecting errors and missing steps. In this paper, we focus on the use of these verification techniques in an interactive setting. However, workflow verification is useful in other contexts. For example, users may compose workflows using editors off-line, then invoke workflow verification to report problems with the workflow. This is typical in scientific environments today, where scientists use text editors to create workflows. There are also many workflow editors that provide useful graphical capabilities [SmartDraw 03, Khoros 03, BizTalk 03] but have no comprehensive error checking facilities.
Other composition tools that use domain knowledge provide limited help in guiding the users during the interaction because they don't explicitly use the semantics associated with steps and links [Chen et al 03, Sirin et al 03]. Our workflow verification techniques would be a useful addition to these workflow editors.

Another context in which workflow verification would be beneficial is reuse, adaptation, and merging of previously existing workflows. In scientific applications, retrieval of past successful workflows as a starting point to design new ones is commonplace. Our workflow verification techniques can help a scientist during the process of adapting these workflows to the new situations. Finally, workflow verification techniques would be useful in assisting users to develop end-to-end applications by merging previously existing workflows that address smaller aspects of the overall application. In merging workflows, many inconsistencies, gaps, and overlaps may occur. Ultimately, user-guided workflow composition would involve not only interactive development but also the abovementioned modalities of one-shot editing, retrieval and adaptation, and merging of existing workflows.

Motivation

Figure 1. An example computational workflow in the earthquake science domain.

Scientific progress is significantly accelerated by integrating components that model different aspects of the phenomena being studied. Projects investigating large-scale scientific applications include [SCEC 03, GriPhyN 03, Geodise 03, MyGrid 03].
We take an example from our collaboration with SCEC (Southern California Earthquake Center); similar issues arise in other scientific applications.

Figure 1 shows an example of a completed workflow for seismic hazard analysis (SHA), which enables building engineers to estimate the impact of potential earthquakes at a construction site and on their building designs. Scientists have developed many models that can be used to simulate various aspects of an earthquake: the rupture of a fault and the ground shaking that follows, the shape of the wave as it propagates through different kinds of soil, the vibration effects on a building structure, etc. The models are complex, heterogeneous, and come with many constraints on their parameters and their use with other models.

Another use of workflow verification is to enable integration of interactive and automatic techniques for workflow development. After a user sketches a workflow, an automated planner could fill in the details and missing steps. However, for this to work, errors in the workflow created by the user, such as redundant steps or inconsistent links, must be removed before an automated planner attempts to complete the workflow. We look at this aspect in detail later in the paper.

In their work, scientists and engineers often want to sketch the workflows themselves, influencing the choices of the software components and the links between them. In many cases end users have requirements and preferences that depend on how the workflow unfolds and cannot be specified beforehand. For example, users may not know which wave propagation model is appropriate until the distance from the fault rupture to the location is determined.

APPROACH
Figure 2 below shows the CAT interface that we built to support interactive workflow composition. The CAT system has been used to support developing computational workflows in the earthquake science domain, such as the one for seismic hazard analysis shown in Figure 1.
Figure 2. The CAT interface for composing workflows for seismic hazard analysis.

In our approach as implemented in CAT, users form a workflow incrementally, and the system checks the validity of the workflow and suggests what to do next. In the process of constructing a workflow, users can add a component to the workflow and make links between the components. When adding a component, the user may indicate an abstract type of component that will be further specialized at a later step. For example, the user may specify that the workflow will include a fault rupture model, and not decide which model to use until the wave propagation model is selected. The user may start from the end results, or from a set of known data, or from the components they want to include.

The analysis of partial workflows created by the user is done using an AI planning framework [Weld 99]. Each component is treated as a step in the plan, the inputs of a component are the preconditions of that step, and the component's outputs are the step's effects. The links between components are treated as causal links; any data provided by the user form the initial state, and the desired end results are the goals for the planning problem. Each action taken by the user (add/remove component, specialize component, add/remove link) is akin to a refinement operator in plan generation. While automatic planning systems can explore the space of plans systematically and guarantee that the final plans are correct, interactive workflow composition requires an approach that lets the user decide what parts of the search space to explore and that can handle incorrect partial workflows.

The next subsection describes how we represent components, abstract types of components, and their constraints.

Supporting Knowledge Base
Figure 3 shows a portion of a sample CAT Knowledge Base (KB) for a travel domain. In the KB's Domain Ontology, there is a hierarchy of data types represented in Loom classes [MacGregor 90].
In the Component Ontology, each component is represented in terms of its input and output parameter data types. For example, component Car-Rental-by-Airport needs an Airport (as arrival-place) and a Date (as arrival-date) as input, and produces a Car-Reservation. Abstract components may have more abstract types of parameters. For example, a more abstract component Car-Rental has an input parameter arrival-place with type Location instead of Airport. Since each parameter of the component is defined in terms of the data types in the Loom KB, CAT can exploit Loom's reasoning capabilities and use them in helping users construct correctly formulated workflows.

Figure 3. An example CAT Knowledge Base.

The knowledge base supports the following queries:
• components(): returns the set of available components (including abstract ones) defined in the KB.
• data-types(): returns the set of data types defined in the KB.
• input-parameters(c): returns the input parameters of component c.
• output-parameters(c): returns the output parameters of component c.
• executable(c): returns true iff c is an executable component.
• range(c, p): returns the class defined (or derived) as the range of parameter p of c. Here we assume that there is only one class that represents the range of the given component and parameter.
• subsumes(t1, t2): returns true iff class t1 subsumes class t2 in the KB.
• specializations(c[, r, v]): returns the subconcepts of c, optionally those where the value for role r is v.
• component-with-output-data-type(t): returns the set of components c ∈ components() s.t. ∃ p ∈ output-parameters(c) with subsumes(range(c, p), t), where t is a data type.
• component-with-input-data-type(t): returns the set of components c ∈ components() s.t. ∃ p ∈ input-parameters(c) with subsumes(t, range(c, p)), where t is a data type.

Examples:
• input-parameters(Rent-Car) = {arrival-time, arrival-location}.
• range(Rent-Car, arrival-time) = time.
• subsumes(Rent-Car, Airport-Car-Rental) = true.
• subsumes(airport, location) = false.

WORKFLOW
A workflow W is a tuple <C, L, I, G> where C is a set of workflow components, L is a set of links, I is a set of Initial-Input components, and G (for Goals) is a set of End-Result components. Initial-Input components and End-Results are handled in most respects as any other component. Initial-Input components (user-input data) can be handled as components with one output parameter and no input parameters. End-Result components are components with one input and no outputs.

Each link is a tuple <c_o, p_o, c_i, p_i> where p_o is an output parameter of a component c_o ∈ C ∪ I, and p_i is an input parameter of a component c_i ∈ C ∪ G. For example, the link between the flight-arrival-date parameter of Reserve-Flight and the arrival-date parameter of Rent-Car in Figure 4 below can be represented as <Reserve-Flight, flight-arrival-date, Rent-Car, arrival-date>. When representing parameters and components in examples, we may use their KB names for convenience, as in this case.

Figure 4. A simple workflow sketch.

Properties of a Workflow
In CAT, the workflow composition process is guided by a set of desirable properties. This section first introduces some features of workflow components and defines those desirable properties in terms of the component features.

Given a workflow <C, L, I, G> and its component c_i ∈ C, p ∈ input-parameters(c_i) is satisfied iff ∃ a link <c_o, p_o, c_i, p_i> ∈ L s.t. p_i = p. That is, an input parameter is satisfied when it is linked to any output parameter of a workflow component. Otherwise we call the parameter unsatisfied. A workflow component is satisfied if all its input parameters are satisfied. In Figure 4, both the Reserve-Flight and Airport-Car-Rental components are unsatisfied.

A link <c1, p1, c2, p2> is consistent iff subsumes(range(c1, p1), range(c2, p2)). Otherwise we call it inconsistent.
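The workflow tuple and the satisfied and consistent checks defined above can be sketched in Python. This is a minimal illustration under stated assumptions, not CAT's implementation: the `PARENT`, `RANGE`, and `INPUTS` tables are hypothetical stand-ins for the Loom knowledge base, the `Link` and `Workflow` types are invented for the example, and the argument order in `is_consistent` is copied from the formal definition as printed.

```python
from dataclasses import dataclass, field

# Hypothetical toy class hierarchy (child -> parent), standing in for the Loom KB.
PARENT = {"Airport": "Location", "Location": None, "Date": None}

# Hypothetical range table: (component, parameter) -> data type.
RANGE = {
    ("Reserve-Flight", "flight-arrival-date"): "Date",
    ("Rent-Car", "arrival-date"): "Date",
    ("Rent-Car", "arrival-location"): "Location",
}

# Hypothetical input parameters of each component.
INPUTS = {"Reserve-Flight": [], "Rent-Car": ["arrival-date", "arrival-location"]}


def subsumes(t1, t2):
    """True iff class t1 subsumes class t2 (t1 is t2 or one of its ancestors)."""
    while t2 is not None:
        if t1 == t2:
            return True
        t2 = PARENT.get(t2)
    return False


def range_of(c, p):
    """Range (data type) of parameter p of component c."""
    return RANGE[(c, p)]


@dataclass(frozen=True)
class Link:
    c_o: str  # source component
    p_o: str  # output parameter of the source
    c_i: str  # destination component
    p_i: str  # input parameter of the destination


@dataclass
class Workflow:
    C: set = field(default_factory=set)  # workflow components
    L: set = field(default_factory=set)  # links
    I: set = field(default_factory=set)  # Initial-Input components
    G: set = field(default_factory=set)  # End-Result components


def param_satisfied(w, c, p):
    """An input parameter is satisfied when some link feeds it."""
    return any(l.c_i == c and l.p_i == p for l in w.L)


def component_satisfied(w, c):
    """A component is satisfied when all its input parameters are."""
    return all(param_satisfied(w, c, p) for p in INPUTS[c])


def is_consistent(link):
    # Argument order follows the printed definition of a consistent link.
    return subsumes(range_of(link.c_o, link.p_o), range_of(link.c_i, link.p_i))


link = Link("Reserve-Flight", "flight-arrival-date", "Rent-Car", "arrival-date")
w = Workflow(C={"Reserve-Flight", "Rent-Car"}, L={link})
print(is_consistent(link))                              # both ranges are Date
print(param_satisfied(w, "Rent-Car", "arrival-date"))   # fed by the link
print(component_satisfied(w, "Rent-Car"))               # arrival-location has no link
```

The example link connects two parameters of the same type, so it is consistent regardless of the direction in which subsumption is tested; Rent-Car remains unsatisfied because nothing feeds its second input.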
Intuitively, when a link's destination data type (the input it feeds) subsumes its source data type (the output it comes from), the data being supplied to the destination is guaranteed to be useful.

Given a workflow <C, L, I, G> and a workflow component c1 ∈ C ∪ I, c1 is remote-linked to a workflow component c2 ∈ C ∪ G iff (∃ link l = <c_o, p_o, c_i, p_i> ∈ L where c_o = c1 and c_i = c2) or (∃ component c3 ∈ C s.t. c1 is remote-linked to c3 and c3 is remote-linked to c2). That is, there exists a (directional) chain of links that connects c1 to c2 in the workflow. In Figure 4, the Initial-Input component with output parameter Date-Time is remote-linked to the End-Result Flight-Res-# via the Reserve-Flight component.

A link l = <c_o, p_o, c_i, p_i> ∈ L is redundant iff ∃ link l2 = <c_o', p_o', c_i', p_i'> ∈ L s.t. l ≠ l2 and c_o = c_o' and p_o = p_o' and c_i = c_i' and p_i = p_i', or if c_o and c_i are remote-linked.

Given a workflow <C, L, I, G>, its component c ∈ C is justified iff c ∈ G or ∃ c2 ∈ G where c is remote-linked to c2. Otherwise, c is unjustified. Currently, the Airport-Car-Rental component in Figure 4 is unjustified.

The following is a list of desirable properties of workflows, based on the elementary properties listed above.

A workflow W = <C, L, I, G> is purposeful iff G ≠ ∅. Otherwise, W is not purposeful. That is, the workflow contains at least one End-Result component. CAT allows users to construct sketches of workflows without specifying desired End-Results, but to complete a workflow, users need to provide the kinds of outcome they expect.

A workflow W = <C, L, I, G> is grounded iff ∀ c ∈ C, executable(c) = true. Otherwise, W is ungrounded. To be able to execute a given workflow, all the components introduced to the workflow should be specialized into executable ones.

A workflow W = <C, L, I, G> is satisfied iff ∀ c ∈ C ∪ G, c is satisfied. Otherwise, W is unsatisfied.

A workflow W = <C, L, I, G> is justified iff ∀ c ∈ C ∪ I, c is justified. Otherwise, W is unjustified.

A workflow W = <C, L, I, G> is cyclic iff ∃ c ∈ C s.t. c is remote-linked to c. Otherwise, W is acyclic.

A workflow W = <C, L, I, G> is consistent iff ∀ link l ∈ L, l is consistent. Otherwise, W is inconsistent.

A workflow is correct if it is purposeful, grounded, acyclic, satisfied, justified, and consistent.

Given a workflow, CAT checks the above properties based on the features of the workflow's components and links, and produces a report on the kinds of problems that made the workflow not correct. The report also includes suggestions for fixing each error.

Workflow Refinement Actions
At any time, the user may perform the composition actions below:
• add a component to the workflow. It may be an abstract component, or a specific (executable) one, or an Initial-Input, or an End-Result.
• add a link between two components (or Initial-Input, or End-Result) to indicate that the output of one should be the input of another.
(The user can also delete anything that can be added.)

AddComponent(w, c)
  Description: Given a workflow w = <C, L, I, G> and c ∈ components(), w becomes <C ∪ {c}, L, I, G>. Note: addition of an Initial-Input or End-Result is done in a similar way for set I or G, respectively.
  Possible fixes: w purposeful.
  Possible new errors: c not grounded or justified, or p ∈ input-parameters(c) not satisfied.

RemoveComponent(w, c)
  Description: Given a workflow w = <C, L, I, G> and c ∈ C, w becomes <C − {c}, L, I, G>. Note: removal of an Initial-Input or End-Result is done in a similar way, as with Add actions. As RemoveComponent simply removes a component without deleting associated links, generating 'dangling' links, we don't allow users to use this primitive action. Instead, users can use RemoveComponentAndLinks, as described below.
  Possible fixes: c grounded or justified, or p ∈ input-parameters(c) satisfied.
  Possible new errors: w not purposeful.

AddLink(w, c1, p1, c2, p2)
  Description: Given a workflow w = <C, L, I, G>, c1 ∈ C ∪ I, p1 ∈ output-parameters(c1), c2 ∈ C ∪ G, and p2 ∈ input-parameters(c2), w becomes <C, L ∪ {<c1, p1, c2, p2>}, I, G>.
  Possible fixes: p2 satisfied, c1 justified.
  Possible new errors: the new link may be inconsistent, cause a cycle, or be redundant with an existing link.

RemoveLink(w, l)
  Description: Given a workflow w = <C, L, I, G> and l ∈ L, w becomes <C, L − {l}, I, G>.
  Possible fixes: l was redundant, not consistent, or cycle-causing.
  Possible new errors: unsatisfied parameter, unjustified component.

Figure 5. Primitive actions and possible resulting workflow properties.

Figure 5 details primitive actions in terms of their effects in the workflow (add and remove actions have complementary effects). In addition to these primitive actions, CAT allows "composite" actions, but currently only within suggested error fixes, in order to make the composition process more coherent and efficient. See Figure 7 below for summaries of these actions' effects (in their role as suggested fixes for particular errors).
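Several of the property checks and primitive actions above can be sketched in Python. This is an illustrative sketch under stated assumptions rather than CAT's ErrorScan algorithm: the `Link` and `Workflow` tuples, the `report` function, and the parameter names in the example are hypothetical, and the satisfied, consistent, and redundant checks are left out to keep the block short.

```python
from collections import namedtuple

Link = namedtuple("Link", "c_o p_o c_i p_i")
Workflow = namedtuple("Workflow", "C L I G")  # four frozensets


def remote_linked(w, c1, c2):
    """True iff a directed chain of one or more links leads from c1 to c2."""
    frontier, seen = [c1], set()
    while frontier:
        c = frontier.pop()
        for l in w.L:
            if l.c_o == c and l.c_i not in seen:
                if l.c_i == c2:
                    return True
                seen.add(l.c_i)
                frontier.append(l.c_i)
    return False


def is_purposeful(w):
    return len(w.G) > 0


def is_grounded(w, executable):
    return all(executable(c) for c in w.C)


def is_cyclic(w):
    return any(remote_linked(w, c, c) for c in w.C)


def is_justified(w, c):
    return c in w.G or any(remote_linked(w, c, g) for g in w.G)


def report(w, executable):
    """List the violated properties (a subset of the checks defined above)."""
    errors = []
    if not is_purposeful(w):
        errors.append("not purposeful")
    if not is_grounded(w, executable):
        errors.append("ungrounded")
    if is_cyclic(w):
        errors.append("cyclic")
    errors += [f"unjustified: {c}" for c in sorted(w.C | w.I)
               if not is_justified(w, c)]
    return errors


# Primitive actions return a new workflow and leave the old one untouched.
def add_component(w, c):
    return w._replace(C=w.C | {c})


def add_link(w, link):
    return w._replace(L=w.L | {link})


def remove_link(w, link):
    return w._replace(L=w.L - {link})


always = lambda c: True  # assumption: every component is executable

# A sketch loosely modeled on Figure 4 (parameter names are made up).
w = Workflow(C=frozenset({"Reserve-Flight"}),
             L=frozenset({Link("Date-Time", "out", "Reserve-Flight", "date"),
                          Link("Reserve-Flight", "res", "Flight-Res-#", "in")}),
             I=frozenset({"Date-Time"}), G=frozenset({"Flight-Res-#"}))
w = add_component(w, "Airport-Car-Rental")
print(report(w, always))   # the car rental leads to no End-Result

# Fix: add an End-Result for the car reservation and link the component to it.
w2 = add_link(w._replace(G=w.G | {"Car-Res-#"}),
              Link("Airport-Car-Rental", "car-res", "Car-Res-#", "in"))
print(report(w2, always))

# Two opposing links between the same components introduce a cycle.
back = Link("Airport-Car-Rental", "car-res", "Reserve-Flight", "date")
w3 = add_link(add_link(w2, Link("Reserve-Flight", "flight-arrival-date",
                                "Airport-Car-Rental", "arrival-date")), back)
print(report(w3, always))
print(report(remove_link(w3, back), always))
```

Returning a fresh workflow from each action is a design choice of this sketch; it makes it easy to compare the error report before and after an action, mirroring the "possible fixes" and "possible new errors" listed in Figure 5.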