Multi-Agent Reinforcement Learning for Compute Allocation in Processing Networks

Thesis Proposal Details

Supervisor: Luca Ballotta

Creation Date: 30/10/2025 16:00

Description

Processing networks are multi-agent systems where each agent is a smart sensor equipped with sensing, compute, and communication capabilities. Agents acquire information from the environment, refine it via local compute (e.g., machine learning or computer vision), and transmit data to a workstation that implements global monitoring and coordinated decision-making for all agents. This setup is common in Internet-of-Things and Edge Computing, where, for example, the smart sensors may be collaborative robots in manufacturing and the workstation is a controller that commands the robots' motion. Since smart sensors carry limited resources, they face a tradeoff between latency and accuracy when processing sensory measurements onboard: either they extract accurate information at the cost of extra delay, or they process data quickly and extract less accurate information. Balancing this tradeoff is key to high performance in time-critical applications.
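As a rough illustration of this tradeoff, the sketch below assumes that a sensor monitors a scalar random-walk signal whose value drifts with variance `drift_var` per time step, and that the workstation uses a single delayed measurement as its estimate of the current value. Under these assumptions, the error variance is the measurement noise variance plus the drift accumulated during the delay. The model, the mode names, and all numbers are hypothetical and chosen only for illustration.

```python
def error_variance(noise_var: float, delay: int, drift_var: float) -> float:
    """Error variance when a measurement taken `delay` steps ago is used as
    the estimate of the current value of a random-walk signal (assumed model)."""
    return noise_var + drift_var * delay


# Hypothetical processing modes: (measurement noise variance, total delay in steps).
# Onboard processing lowers the noise but adds delay.
MODES = {"raw": (1.0, 1), "processed": (0.1, 5)}


def best_mode(drift_var: float) -> str:
    """Return the mode with the smallest error variance for a given drift."""
    return min(MODES, key=lambda m: error_variance(*MODES[m], drift_var))
```

With these numbers, onboard processing pays off only for slowly varying signals: the two modes break even at `drift_var = 0.9 / 4 = 0.225`, so `best_mode(0.01)` selects the processed mode and `best_mode(1.0)` selects the raw mode.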

In this thesis, you will use multi-agent reinforcement learning to find an efficient policy that decides how each smart sensor in a processing network refines sensory data, if at all, before transmitting it to the workstation, drawing inspiration from and extending the frameworks in [1,2]. The primary focus will be optimal global estimation/monitoring, with use cases from robotic mapping and coverage. If progress and time permit, we may move from monitoring to decision-making. In any case, use cases will be agreed upon with the supervisor before the thesis starts, to make sure they match your interests.
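To give a flavor of the approach, here is a minimal sketch of independent learners, a simple multi-agent reinforcement learning baseline, in a deliberately simplified setting. It is not the method of [1,2]. It assumes a stateless problem in which each sensor repeatedly chooses between sending raw data (action 0) and processing onboard first (action 1), every sensor receives the same team reward, and the cost of a sensor is its measurement noise variance plus a drift term that grows with delay. The function names, the cost model, and all numbers are hypothetical.

```python
import random

# Hypothetical modes: action 0 = send raw data, action 1 = process onboard first.
NOISE_VAR = (1.0, 0.1)  # measurement noise variance for each action
DELAY = (1, 5)          # total delay in time steps for each action


def sensor_cost(action, drift_var):
    """Assumed per-sensor cost: noise variance plus drift accumulated over the delay."""
    return NOISE_VAR[action] + drift_var * DELAY[action]


def train_independent_learners(drift_vars, steps=5000, epsilon=0.1, seed=0):
    """Each sensor runs its own epsilon-greedy value estimate on the shared team reward."""
    rng = random.Random(seed)
    n = len(drift_vars)
    q = [[0.0, 0.0] for _ in range(n)]   # per-sensor action-value estimates
    counts = [[0, 0] for _ in range(n)]  # per-sensor action counts
    for _ in range(steps):
        actions = []
        for i in range(n):
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))                # explore
            else:
                actions.append(0 if q[i][0] >= q[i][1] else 1)  # exploit
        # Team reward: negative total cost, observed by every sensor.
        reward = -sum(sensor_cost(a, d) for a, d in zip(actions, drift_vars))
        for i, a in enumerate(actions):
            counts[i][a] += 1
            q[i][a] += (reward - q[i][a]) / counts[i][a]        # sample-average update
    # Greedy action of each sensor after training.
    return [0 if q[i][0] >= q[i][1] else 1 for i in range(n)]
```

A call such as `train_independent_learners([0.01, 1.0])` trains one sensor that monitors a slowly varying signal and one that monitors a fast one. Under the assumed cost, the slow sensor is better off processing (cost 0.15 versus 1.01) and the fast sensor is better off sending raw data (cost 2.0 versus 5.1), which is the per-sensor optimum the learners are meant to recover. Because the team reward here is a plain sum of per-sensor costs, the sensors are only weakly coupled; coupled costs, state-dependent policies, and function approximation are where the actual thesis work would begin.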

This thesis may involve a – possibly remote – collaboration with Purdue University.

  1. L. Ballotta, G. Peserico, F. Zanini, and P. Dini, “To Compute or Not to Compute? Adaptive Smart Sensing in Resource-Constrained Edge Computing,” IEEE Trans. Netw. Sci. Eng., 2024. Available at https://arxiv.org/abs/2209.02166

  2. V. Tripathi, L. Ballotta, L. Carlone, and E. Modiano, “Computation and Communication Co-Design for Real-Time Monitoring and Control in Multi-Agent Systems,” International Symposium on Modeling and Optimization in Mobile, Ad hoc, and Wireless Networks, 2021. Available at https://arxiv.org/abs/2108.03122

Dataset and methods

Dataset type: Data to be acquired

Dataset description: You will mainly generate synthetic data and may also use publicly available datasets.

List of Methods: Python, PyTorch, reinforcement learning, multi-agent reinforcement learning

Preparatory Courses

Machine learning, reinforcement learning. Optional: estimation and filtering, multi-agent systems, deep learning

Tags
edge computing, energy efficiency, machine learning, multi-agent systems, reinforcement learning, resource allocation