The increasing adoption of cloud, edge, and Internet of Things (IoT) technologies has led to the emergence of the cloud–edge compute continuum, enabling low-latency, data-intensive applications across highly distributed infrastructures. However, designing and operating high-performance data pipelines in such environments remains challenging due to heterogeneous resources, distributed data sources, stringent security requirements, and the need for efficient data movement and I/O provisioning. This paper presents an architecture for data management and I/O provisioning that supports the execution of high-performance data pipelines across the cloud–edge continuum. The proposed approach combines federated data management, intelligent data placement, and continuum-aware resource orchestration within a unified platform. These capabilities are integrated with pipeline runtime services that enable adaptive execution and optimization based on observed data access patterns and resource availability. The proposed architecture is evaluated using a real industrial data pipeline scenario from the fiber laser cutting domain.