Preliminary Findings of NSF Workshop on Data-Centric Workflow

May 3, 2009

Workflow technology is central to the efficient and flexible management of business and government processes, including management of information, resources, personnel, manufacturing, transportation, healthcare delivery, and the analysis of scientific information. In the past few years, a data-centric approach to workflow has emerged, which has the potential of significantly enhancing, and in some cases supplanting, the traditional process-centric approach. This shift towards data-centricity of workflow is occurring along two broad dimensions.

First, conceptual models for data-aware workflow are emerging in the business process management arena, and also arise more implicitly in healthcare delivery and digital government. These models elevate the data being manipulated by the workflows to a level of prominence essentially equivalent to the level given to process flow in conventional models. It appears that these models permit a unification of the conceptual models for workflow requirements, policies, activity flows, data management, and monitoring, which have hitherto been rather disparate.

Second, the area of workflow as data has been growing in recent years, primarily through the lens of scientific workflow, and also in recent research aimed at business applications. A key focus here is to be able to easily represent, store, and query both workflow schemas and enactments (i.e., "runs") of those schemas. This is useful for understanding the provenance or history of how data is produced or updated, for discovering and re-using workflow schemas, and for monitoring compliance of workflows with government or other regulations and policies.
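The two dimensions described above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the "order" artifact, its stage names, the `Artifact` class, and the `who_set` helper are all hypothetical and are not taken from any model discussed at the workshop. It pairs a data record with a lifecycle (the data-aware view) and records each step of an enactment so that it can later be queried for provenance (the workflow-as-data view).

```python
from dataclasses import dataclass, field

# Hypothetical lifecycle for an "order" artifact: allowed stage transitions.
LIFECYCLE = {
    "created": {"approved", "cancelled"},
    "approved": {"shipped", "cancelled"},
    "shipped": {"closed"},
    "cancelled": set(),
    "closed": set(),
}


@dataclass
class Artifact:
    """A data record paired with its lifecycle stage and enactment history."""
    artifact_id: str
    data: dict
    stage: str = "created"
    # Each entry: (from_stage, to_stage, actor, names of fields changed)
    history: list = field(default_factory=list)

    def advance(self, to_stage, actor, **updates):
        """Move to a new stage if the lifecycle permits it, logging the step."""
        if to_stage not in LIFECYCLE[self.stage]:
            raise ValueError(f"{self.stage} -> {to_stage} is not allowed")
        self.data.update(updates)
        self.history.append((self.stage, to_stage, actor, sorted(updates)))
        self.stage = to_stage


def who_set(artifact, field_name):
    """Provenance query: which actors updated a given field, in order."""
    return [actor for _, _, actor, changed in artifact.history
            if field_name in changed]


order = Artifact("o-1", {"item": "widget"})
order.advance("approved", "alice", approved_by="alice")
order.advance("shipped", "bob", tracking="T123")
print(order.stage)
print(who_set(order, "tracking"))
```

Because the history is ordinary data, questions about provenance or compliance become queries over it rather than inspections of process logic.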
The workshop is centered on articulating the underlying tenets of data-centric workflow, identifying key advances already made in the area, and identifying research challenges that should be addressed in the coming years in order to maximize the value that can be gained from these new perspectives on workflow. The workshop will address research in data-centric workflow both from a general perspective and from the perspectives of application in business, healthcare, digital government, and science. A synopsis of some discussions held by the participants in advance of the workshop is presented here, in order to provide a preliminary indication of the kinds of results anticipated from the workshop.
Data-aware Workflow

Current practices in the design, deployment, and maintenance of workflows, especially in the realm of business applications, are fundamentally disjointed, because different conceptual models are used for the different aspects of managing business operations. Four key dimensions in business process management are:
Workflow as Data

In a variety of contexts and application areas, it is useful to understand how a workflow is performing its activities, and even to manipulate the ways that it is performing them. For example, there are the following needs.
Research Challenges in Data-Centric Workflow, Considered at a general level

While some progress has been made in data-centric workflow, this field is still quite young, and many research questions need further study in order to maximize the potential value of this new perspective. Some of the main research themes are now listed.

The first themes listed relate primarily to data-aware workflow.

W1: Conceptual Models. The field should continue to invent, extend, and refine data-aware/data-centric workflow models, especially those based on a tight coupling of data schema and lifecycle specification.

W2: Foundations. Study fundamental properties of these models (e.g., analysis, synthesis, expressive power, views, interaction, and perhaps something analogous to database normal forms).

W3: Systems Issues. Study approaches to architecture and implementation of data-aware workflow, including optimization, distribution, security, monitoring, and reporting.

W4: Enabling Richer Semantics. Understand the implications of using ontologies rather than simple data schemas in data-aware workflow models; incorporate techniques from semantic web services for auto-discovery, auto-composition, and auto-monitoring; develop a much deeper understanding of the semantics underlying OMG's Semantics for Business Vocabulary and Rules (SBVR).

W5: Ecosystem Enablers. In the context of data-aware workflow: enable workflow schema design, evolution, and variations; manage workflow evolution in the context of "in-flight" enactments; manage complex events; incorporate people and performers in the spirit of BPEL4People; incorporate security; and explore whether a data-aware approach enables new ways of managing exceptions.

Some themes relevant primarily to workflow as data are now listed.

W6: Querying of workflow schemas and enactments. Continue with paradigms, frameworks, and techniques for querying and manipulating both workflow schemas and workflow enactments. Is there an "algebra" or "calculus" for building workflow schemas from other workflow schemas? Extend results from scientific workflow to other application areas, where workflows generally have side-effects.

W7: Provenance, both coarse- and fine-grained. Find ways to combine the coarse-grained provenance research in scientific workflow with the fine-grained provenance work from the database community.

W8: Temporal aspects of workflow. Workflow schemas describe how data and processes are to occur over time, and workflow enactments include a history of what actions were taken through time. The common approaches to temporal databases do not appear well-suited for the kinds of querying, discovery, and manipulation that are needed in the context of workflow as data. Approaches such as PatternSQL, which enable queries over sequential patterns, appear to have promise.
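To give a flavor of what a query over sequential patterns in enactment histories might look like, here is a minimal Python sketch. It is a plain stand-in written for illustration and does not use or reproduce PatternSQL syntax; the run identifiers, the activity names, and the functions `matches_sequence` and `runs_matching` are hypothetical. Each enactment is assumed to be stored as an ordered list of activity names, and the query asks which runs contain a given sequence of activities in order, possibly with other activities in between.

```python
def matches_sequence(events, pattern):
    """Return True if the activities in `pattern` occur in `events` in order,
    possibly with other activities in between (a subsequence match)."""
    remaining = iter(events)
    # `step in remaining` advances the iterator until it finds `step`,
    # so each later step is only searched for after the earlier ones.
    return all(step in remaining for step in pattern)


# Hypothetical enactments: run id -> ordered list of activity names.
enactments = {
    "run-1": ["receive", "review", "approve", "pay"],
    "run-2": ["receive", "pay", "review"],
    "run-3": ["receive", "review", "reject"],
}


def runs_matching(enactments, pattern):
    """Return the ids of all runs whose history contains `pattern` in order."""
    return sorted(run for run, events in enactments.items()
                  if matches_sequence(events, pattern))


# Runs in which a review took place before a payment.
print(runs_matching(enactments, ["review", "pay"]))
```

The same helper could support a simple compliance check, for instance by flagging runs that contain a payment but do not match the review-then-pay pattern.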
Research themes in Science

While research in scientific workflow is somewhat advanced, and has already found its way into practice, this has served to stimulate the need for more research and a deeper understanding of these workflows. Some key research themes are now listed.
Research themes in Business

Some key research themes in the business application area include the following:
Research themes in Digital Government

Some of the unique aspects of digital government are (a) the government must serve everybody; (b) the government is made up of numerous jurisdictions, many of which overlap; and (c) the government has highly sensitive information and is obligated to keep it secure and honor. There is also the issue of scale: taken in aggregate, the governments of larger nations arguably hold much more data and perform much more processing than any other kind of organization. Some key research themes in this area include:
Research themes in Healthcare Delivery

Some key research themes here include the following.
High-level recommendations of the workshop

The workshop will also make some high-level recommendations for the field of research into workflow taken as a whole. Some tentative recommendations are as follows.