Abstract
Advances in networks, accelerators, and cloud services encourage programmers to reconsider where to compute—for example, when fast networks make it cost-effective to compute on remote accelerators despite added latency. Workflow and cloud-hosted serverless computing frameworks can manage multi-step computations spanning federated collections of cloud, high-performance computing, and edge systems, but they rely on simple abstractions that pose challenges when building applications composed of multiple distinct software components with differing communication patterns. This dissertation introduces new techniques for programming distributed science applications deployed across the computing continuum—research infrastructure that spans personal, cloud, edge, and high-performance computing (HPC) systems. TaPS, a benchmarking suite for reliable evaluation of parallel execution frameworks, is developed and used to investigate limitations in existing solutions. This investigation motivates the design of ProxyStore, a library that extends the pass-by-reference model to distributed applications with the goal of decoupling data flow from control flow. ProxyStore's object proxy paradigm enables dynamic selection among data movement methods depending on what data are moved, where they are moved, and when—a long-standing challenge in distributed applications. Three high-level patterns—distributed futures, streaming, and ownership—extend the low-level proxy paradigm to support science applications spanning bioinformatics, federated learning, and molecular design, for which substantial improvements in runtime, throughput, and memory usage are demonstrated. Finally, Academy, a federated agent system, supports the creation and deployment of pervasive autonomous agents, decentralizing control flow across stateful entities. Together, these techniques form an open-source toolbox for developing novel, performant science applications and federated frameworks for the computing continuum.