Abstract
Modern computer networks generate large volumes of data that can benefit network research, management, and security. This data is fast-evolving, increasingly encrypted, and highly siloed, which makes it difficult to analyze using traditional methods based on predefined rules and signatures. Machine learning (ML) methods have shown promise in identifying complex patterns and insights in network data, yet they often face reliability issues in real-world network operations. This thesis addresses multiple practical challenges unique to integrating data-driven approaches in network operations: (1) the need to enrich traffic patterns from diverse data inputs, (2) the need for scalable platforms that support real-time decision-making for high-throughput data flows, and (3) the need to adapt to constantly changing network characteristics and user behaviors. Addressing these challenges opens new opportunities for collaboration across multiple network modalities and entities, for performing data inference at tens of Gbps on general-purpose hardware, and for continuously adapting to different environments. The thesis also discusses how overcoming these challenges can pave the way for a future that empowers all stakeholders (model developers, network operators, and network service users) to interpret and manage network interactions with greater reliability and transparency.