Invited Talks
2025
- Digital-Twin-Enabled Predictive Traffic Sensing via Multi-Agent Risk-Constrained Online Learning. In The Annual East Coast Optimization Meeting, Center for Mathematics and Artificial Intelligence, George Mason University, Arlington, VA, Apr 2025
We investigate real-time cooperative traffic sensing using pan-tilt cameras (PTCs) in urban transportation networks for traffic digital twin (DT) synchronization and predictive mobility management. Each PTC is tasked with 1) real-time information acquisition: dynamically adjusting its tilt directions to capture maximum traffic flows across the network in coordination with other cameras to synchronize the DT with the physical world, and 2) predictive surveillance: tilting toward edges with high driving risks predicted by the DT before any incidents actually occur. We formulate the real-time PTC control problem as an online constrained optimization problem, allowing the PTC agents to dynamically balance two disparate yet closely related traffic sensing tasks and achieve proactive surveillance online. We propose the multi-agent risk-constrained online learning (MARCOL) algorithm to enable multiple PTC agents to solve the online constrained optimization collectively without inter-agent communication. The novelty of MARCOL lies in the Lagrangian multiplier correlated across agents and the resulting primal-dual updates based on mirror descent, which help the agents coordinate their tilting strategies. Since evaluating the Lagrangian function requires network-level traffic states, MARCOL employs the DT to simulate and predict the underlying states using past camera observations. We conclude the presentation with theoretical insights on the asymptotics of the learning dynamics under a stylized setup and discuss their connection with the Berk-Nash equilibrium.
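The talk's exact formulation is not reproduced here, so the following is only a minimal sketch of a multi-agent primal-dual update of the flavor described above: entropic mirror descent on each camera's tilting distribution and dual ascent on a single multiplier shared across agents. The loss, risk, and DT-prediction callables, the simplex parameterization, and the step sizes are all illustrative assumptions, not MARCOL itself.

```python
import numpy as np

def mirror_step(x, grad, eta):
    """Entropic mirror-descent step on the probability simplex."""
    y = x * np.exp(-eta * grad)
    return y / y.sum()

def primal_dual_round(strategies, lam, dt_predict, flow_grad, risk, risk_budget,
                      eta_x=0.1, eta_lam=0.05):
    """One round of a multi-agent primal-dual update with a shared multiplier.

    strategies  : list of per-camera tilting distributions over edges
    lam         : shared Lagrange multiplier for the risk constraint (correlates agents)
    dt_predict  : callable mapping strategies -> predicted network traffic state
    flow_grad   : callable (state, i) -> gradient of the (negative) captured flow for agent i
    risk        : callable (state, i) -> per-edge predicted risk exposure for agent i
    risk_budget : scalar constraint level (assumed)
    """
    state = dt_predict(strategies)          # DT simulates the unobserved network state
    new_strategies = []
    total_violation = 0.0
    for i, x in enumerate(strategies):
        # Lagrangian gradient: sensing loss plus risk term weighted by the shared multiplier
        g = flow_grad(state, i) + lam * risk(state, i)
        new_strategies.append(mirror_step(x, g, eta_x))
        total_violation += float(np.dot(x, risk(state, i))) - risk_budget
    # Dual ascent on the shared multiplier, projected to be nonnegative
    lam = max(0.0, lam + eta_lam * total_violation)
    return new_strategies, lam
```

In a coupling of this kind, the shared multiplier is what lets the agents trade off flow capture against risk monitoring consistently without exchanging messages directly.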
- Risk-Aware Long-Short-Term Adaptive Twining in Urban Traffic Digital Twin. In Workshop on Uncertainty Quantification Strategies for Multi-Physics Systems and Digital Twins, Institute for Mathematical and Statistical Innovation, University of Chicago, Chicago, IL, Feb 2025
An urban digital twin (DT) can predict future traffic states by leveraging real-time urban traffic data collected by sensing infrastructure, enabling the simulation and testing of novel configurations and fostering proactive mobility management. To accurately replicate mobility states, the DT requires agent-based simulation techniques, such as Simulation of Urban Mobility (SUMO), which model individual vehicles and allow for the observation of complex interactions and behaviors typical of real-world traffic. However, one instance of SUMO simulation of a large-scale traffic network, which we call twining, consumes substantial computational resources. Hence, within a given service horizon, an economical choice is to run fewer long-term simulations, which may incur a loss of fidelity due to less frequent calibration. On the other hand, short-term twining is more responsive to traffic disturbances and offers accurate replicas at the cost of higher computational overhead. To get the best of both worlds, we propose the long-short-term twining (LSTT) mechanism, a self-adaptive twining scheme that automatically switches between long- and short-term periods. We develop an event-triggered control strategy, where LSTT evaluates the simulation risk by comparing collected real-time traffic information with its simulated counterpart and adjusts the twining period accordingly.
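As an illustration of the event-triggered idea above, here is a minimal sketch in which the twining period shortens whenever a simple observed-versus-simulated discrepancy exceeds a threshold; the period lengths, threshold value, and discrepancy metric are assumptions for illustration, not the LSTT design itself.

```python
import numpy as np

LONG_PERIOD = 900    # seconds between re-calibrations in long-term twining (assumed)
SHORT_PERIOD = 60    # seconds between re-calibrations in short-term twining (assumed)

def simulation_risk(observed_flows, simulated_flows):
    """Relative discrepancy between real-time observations and the DT replica
    (illustrative metric; any calibrated risk score could be substituted)."""
    observed = np.asarray(observed_flows, dtype=float)
    simulated = np.asarray(simulated_flows, dtype=float)
    return float(np.mean(np.abs(observed - simulated)) / (np.mean(observed) + 1e-9))

def next_twining_period(observed_flows, simulated_flows, risk_threshold=0.2):
    """Event-triggered rule: switch to short-term twining when the simulation risk
    is high, and fall back to the cheaper long-term schedule otherwise."""
    risk = simulation_risk(observed_flows, simulated_flows)
    return SHORT_PERIOD if risk > risk_threshold else LONG_PERIOD
```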
2024
- Online Optimization Meets Urban Transportation. In Student Learning Hub Seminar Series, C2SMART Center, New York University, New York, NY, Nov 2024
Urban transportation networks are inherently complex and dynamic, characterized by intricate road connections and diverse network structures coupled with time-variant traffic demands and frequent traffic incidents. Hence, offline planning or design alone cannot guarantee real-time operational control and management of urban transportation systems, which may fail when physical attacks, unforeseen conditions, or unanticipated uses place the system outside the design envelope. A desirable real-time operation mechanism must adapt to the dynamic environment and determine management decisions to be executed while the system is running; i.e., input data arising over time have to be processed, and decisions have to be made before all input data are known. Such a decision-making process falls within the realm of online optimization or online learning. Motivated by several intelligent transportation applications from our past research projects, this tutorial provides a gentle introduction to online optimization methods, with emphasis on intuitive insights and relevance to transportation applications. The tutorial starts with gradient descent algorithms in conventional convex optimization and then moves to online gradient descent for online optimization problems. Extending from single-agent online optimization, we briefly touch upon multi-agent online learning and the associated equilibrium convergence. We conclude the tutorial by discussing the open problems and challenges in deploying online optimization in urban transportation systems.
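To accompany the tutorial outline, here is a minimal, generic sketch of projected online gradient descent; the feasible set, projection, loss stream, and step-size schedule are placeholders rather than any specific transportation model from the talk.

```python
import numpy as np

def online_gradient_descent(x0, loss_grads, project, eta=0.1):
    """Projected online gradient descent.

    x0         : initial decision (e.g., a signal-timing or routing vector)
    loss_grads : iterable of gradient oracles, one per round, revealed online
    project    : Euclidean projection onto the feasible set
    eta        : base step size (a 1/sqrt(t) schedule gives sublinear regret)
    """
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for t, grad_t in enumerate(loss_grads, start=1):
        x = project(x - (eta / np.sqrt(t)) * grad_t(x))
        iterates.append(x.copy())
    return iterates

# Toy usage: track a drifting target under box constraints.
project_box = lambda v: np.clip(v, 0.0, 1.0)
grads = [(lambda x, c=0.5 + 0.3 * np.sin(t / 5): 2 * (x - c)) for t in range(50)]
trajectory = online_gradient_descent(np.array([0.0]), grads, project_box)
```

The decaying step size is the standard choice that keeps the decision-maker's cumulative loss close to that of the best fixed decision in hindsight, which is the regret notion typically used to evaluate such online schemes.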
- Towards Agent-Based Autonomous Network Security. In IEEE COMSOC TCCN Rising Star Symposium Series, Stevens Institute of Technology, Hoboken, NJ, Nov 2024
Security of cyber-physical network systems, such as 5G/6G communication networks, vehicular networks, and the Internet of Things, has become increasingly critical. Traditional security mechanisms rely primarily on manual operations, which can be slow, expensive, and ineffective in the face of a dynamic landscape of adversarial threats. This problem will only be exacerbated as attackers leverage artificial intelligence (AI) to automate their workflows. As a countermeasure, safeguarding critical network systems also calls for autonomous defensive operations that delegate security decisions to AI agents. This talk presents our agent-based framework for autonomous attack detection and response using reinforcement learning (RL) and large language models (LLMs). To address conventional RL's reactive nature, we propose a new RL paradigm, conjectural online RL (coRL), to equip the security agent with predictive power when dealing with the agent's epistemic uncertainty over the attacker's presence and actions. The intuition of coRL is to endogenize the epistemic uncertainty as part of the RL process: the agent maintains an internal world model as a conjecture of the uncertainty, and the learned conjecture produces valid predictions consistent with the environment feedback induced by the epistemic uncertainty. To mitigate the RL agent's reliance on stylized modeling and textual data pre-processing, we further incorporate LLMs into the agentic framework to deliver end-to-end autonomous cyber operations. We conclude the talk by discussing the path ahead to building fully autonomous security agents.
- Conjectural Online Learning in Asymmetric Information Stochastic Games. In Systems Engineering Department Seminar Series, City University of Hong Kong, Hong Kong, Oct 2024
Modern socio-technical network systems powered by artificial intelligence (AI) technologies feature sophisticated interactions among humans, AI agents, and system entities. Asymmetric information stochastic games (AISGs) provide principled mathematical modeling for such interactions, leading to game-theoretic mechanisms for network management. However, existing computational and learning methods for AISGs are primarily offline and lack adaptability to online nonstationarity, falling short of the proactive intelligence needed for resilient network management. To address these limitations, we propose conjectural online learning (COL), an online learning framework for generic AISGs. COL uses a forecaster-actor-critic (FAC) architecture, where the forecaster conjectures the other agents' strategies and the system dynamics within a look-ahead horizon, representing the agent's subjective (mis)perception of the AISG. Based on these subjective perceptions, COL employs online rollout (actor-critic) to improve the policy. Bayesian learning is then used to calibrate the conjectures using information feedback. We establish that the conjectures produced by COL are asymptotically consistent with the information feedback in the sense of a relaxed Bayesian consistency. We deploy COL in a nonstationary IT infrastructure digital twin, where it delivers online, adaptable defense against advanced persistent threats compared with benchmark reinforcement learning techniques.
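A minimal sketch of the calibration step in the forecaster-actor-critic loop described above, assuming a finite family of candidate conjectures with computable likelihoods; the rollout (actor-critic) improvement is abstracted into a placeholder callable, so this illustrates the interface rather than the talk's implementation.

```python
import numpy as np

def bayesian_calibration(prior, likelihoods, observation):
    """One Bayesian update over a finite family of conjectures.

    prior       : probability vector over candidate conjectures
    likelihoods : list of callables, likelihoods[k](observation) = P(obs | conjecture k)
    observation : information feedback received at this stage
    """
    prior = np.asarray(prior, dtype=float)
    evidence = np.array([lik(observation) for lik in likelihoods])
    posterior = prior * evidence
    total = posterior.sum()
    # If every conjecture assigns zero likelihood (severe misspecification), keep the prior.
    return posterior / total if total > 0 else prior

def col_stage(prior, likelihoods, observation, rollout_policy_improvement):
    """One conjectural-online-learning stage: calibrate the conjecture, then improve
    the policy by rollout. `rollout_policy_improvement` is a placeholder callable
    that maps the calibrated conjecture to an updated policy (the actor-critic step)."""
    posterior = bayesian_calibration(prior, likelihoods, observation)
    policy = rollout_policy_improvement(posterior)
    return posterior, policy
```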
- Agent of Agents: Meta LLM-Agent for Autonomous Security Operations. In NSF Workshop on Large Language Models for Network Security, Center for Cybersecurity, New York University, New York, NY, Oct 2024
Today’s security operations are largely manual, slow, costly, and often ineffective. There is growing interest in moving toward more autonomous agents that can handle security operations with greater efficiency and lower operational costs. These agents can make decisions autonomously, adapt to new threats, and provide real-time automated responses. This shift is part of a broader transition from traditional rule-based engines to more sophisticated AI-driven engines, such as those powered by Reinforcement Learning (RL) and Large Language Models (LLMs). In this talk, we present a prototyping framework of the Meta Agent, or the Agent of Agents. A Meta Agent is essentially a mosaic of specialized agents, each focused on particular tasks, resulting in a cost-efficient and customizable agentic solution. By synthesizing insights from various agents, the Meta Agent can tackle complex tasks that no single agent could manage on its own. In addition to the Meta Agent concept, the integration of game-theoretic methods and LLMs also provides a symbiotic framework for cybersecurity operations. From the bottom up, game models provide a strategic framework for analyzing and defining high-level goals. From the top down, LLMs can take these strategic commands and translate them into operational tactics, allowing the system to execute the desired actions effectively. This integration of strategy and operation offers a more holistic approach to managing security operations, ensuring that high-level decisions are carried out with precision at the tactical level.
- Conjectural Online Learning with First-order Beliefs in Stochastic Games. In Coordinated Science Laboratory, University of Illinois Urbana-Champaign, Champaign, IL, Aug 2024
Existing computational methods for asymmetric information stochastic games (AISGs) are primarily offline and cannot adapt to equilibrium deviations. Further, current methods are limited to particular information structures to avoid belief hierarchies. Considering these limitations, we propose conjectural online learning (COL), an online learning method under generic information structures in AISGs. COL uses a forecaster-actor-critic architecture, where subjective forecasts are used to conjecture the opponents’ strategies within a look-ahead horizon, and Bayesian learning is used to calibrate the conjectures. To adapt strategies to nonstationary environments based on information feedback, COL uses online rollout with cost function approximation (actor-critic). We prove that the conjectures produced by COL are asymptotically consistent with the information feedback in the sense of a relaxed Bayesian consistency. We also prove that the empirical strategy profile induced by COL converges to the Berk-Nash equilibrium, a solution concept characterizing rationality under subjectivity.
- Automated Security Response Through Conjectural Online Learning under Information Asymmetry. In Autonomous Robotics and Control Laboratory, California Institute of Technology, Pasadena, CA, Jun 2024
We study automated security response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed, non-stationary game. We relax the standard assumption that the game model is correctly specified and consider that each player has a probabilistic conjecture about the model, which may be misspecified in the sense that the true model has probability 0. This formulation allows us to capture uncertainty and misconception about the infrastructure and the intents of the players. To learn effective game strategies online, we design Conjectural Online Learning (COL), a novel method where a player iteratively adapts its conjecture using Bayesian learning and updates its strategy through rollout. We prove that the conjectures converge to best fits, and we provide a bound on the performance improvement that rollout enables with a conjectured model. To characterize the steady state of the game, we propose a variant of the Berk-Nash equilibrium. We present COL through an advanced persistent threat use case. Testbed evaluations show that COL produces effective security strategies that adapt to a changing environment. We also find that COL enables faster convergence than current reinforcement learning techniques.
- Multi-level Traffic-responsive Tilt Camera Surveillance through Predictive Correlated Online Learning. In NYU Urban Research Day, The Robert F. Wagner Graduate School of Public Service, New York University, New York, NY, Mar 2024
In urban traffic management, the primary challenge of dynamically and efficiently monitoring traffic conditions is compounded by the underutilization of the thousands of surveillance cameras deployed across intelligent transportation systems. This talk introduces the multi-level Traffic-responsive Tilt Camera surveillance system (TTC-X), a novel framework designed for dynamic and efficient monitoring and management of traffic in urban networks. By leveraging widely deployed pan-tilt cameras (PTCs), TTC-X overcomes the fixed field of view that limits traditional surveillance systems by providing mobilized, 360-degree coverage. The innovation of TTC-X lies in the integration of advanced machine learning modules, including a detector-predictor-controller structure, with a novel Predictive Correlated Online Learning (PiCOL) methodology and a Spatial-Temporal Graph Predictor (STGP) for real-time traffic estimation and PTC control. TTC-X is tested and evaluated under three experimental scenarios (maximum traffic flow capture, dynamic route planning, and traffic state estimation) in a simulation environment calibrated with real-world traffic data from Brooklyn, New York. The experimental results show that TTC-X captured over 60% of the total number of vehicles at the network level, dynamically adjusted its route recommendations in reaction to unexpected full-lane closure events, and reconstructed link-level traffic states with a best MAE below 1.25 vehicles/hour. Demonstrating scalability, cost-efficiency, and adaptability, TTC-X emerges as a powerful solution for urban traffic management in both cyber-physical and real-world environments.
2022
- On the Role of Information Structures in Multi-agent Learning. In The 33rd International Conference on Game Theory, Stony Brook Center for Game Theory, Stony Brook, NY, Jul 2022
Multi-agent learning (MAL) studies how agents learn to behave optimally and adaptively from their experience when interacting with other agents in dynamic environments. Information structures play a significant role in the learning mechanisms of the agents. This review creates a taxonomy of MAL and establishes a unified and systematic way to understand MAL from the perspective of information structures. We define three fundamental components of MAL: the information structure (i.e., what the agent can observe), the belief generation (i.e., how the agent forms a belief about others based on the observations), as well as the policy generation (i.e., how the agent generates its policy based on its belief). This perspective allows addressing one pivotal challenge in MAL, i.e., the non-stationarity of the environment when agents update their strategies concurrently. In addition, this taxonomy enables the classification of a wide range of state-of-the-art algorithms into four categories based on the belief-generation mechanisms of the opponents, including stationary, conjectured, calibrated, and sophisticated opponents. We introduce Value of Information (VoI) as a metric to quantify the impact of different information structures on MAL. Finally, we discuss the strengths and limitations of algorithms from different categories and point to promising avenues of future research.
- Informationally Mosaic Reinforcement Learning. In Special Session on Markov Decision Processes, SIAM 2022 Annual Meeting, David L. Lawrence Convention Center, Pittsburgh, PA, Jul 2022
Multi-agent reinforcement learning (MARL) has shown encouraging success in addressing the sequential decision-making problems of multiple autonomous agents within a dynamic environment. The key to this success is that MARL enables agents to adjust strategies based on their perceptions of the surroundings and the feedback from the environment. We refer to the structure of feedback and perceptions as the information structure of MARL. To achieve broader deployment in reality, MARL must be able to adapt agents to varying information structures. The issue of learning under unknown, dynamic, and generally amorphous information structures poses a great challenge to current MARL studies. To address it, we propose a novel framework, Informationally Mosaic Multi-Agent Reinforcement Learning (IMMARL), where agents with different information structures coordinate in an unprescribed way to explore and utilize constructive information from the environment. In particular, each agent's exploration operates in a laissez-faire manner; that is, it voluntarily rewards others for discovering and sharing helpful information. The proposed framework enables flexible interoperability and increases modularity in MARL systems. We introduce a novel metric, Value of Information (VoI), to quantify the importance of informational exploration during learning. We corroborate the proposed IMMARL and VoI using experiments conducted in procedurally generated benchmark environments.
2020
- Multi-Agent Correlated Learning over Networks. In Special Session on Game Theoretic Learning in Networks, INFORMS Annual Meeting, Online, Nov 2020
We study game-theoretic learning over networks, which is of great importance for multi-agent decision-making and has been widely applied to problems ranging from smart grids, supply chain management, and autonomous vehicles to cybersecurity and social network analysis. Prior works on game-theoretic learning have not paid enough attention to the topology of the network or to the relationship between local interactions within neighborhoods and the overall correlations among all agents on the network. For example, centralized approaches rely on global information, i.e., information about each individual in the networked system, which is neither practical nor efficient in large and complex systems, whereas fully decentralized approaches, though more self-reliant, cannot tell how the mutual influence among agents leads to a desired coordination. In this presentation, we show how local correlations within neighborhoods result in correlated learning over the whole network system, where the influence of each agent's action propagates through the underlying network, leading to coordinated behaviors of networked agents. The presentation covers three aspects. We first present mathematical models for games over networks and introduce related game theory basics as well as the tools needed to analyze strategic interactions among game players. We then provide various interpretations of correlations over networks in different contexts, involving correlated equilibria and correlated mechanisms over networks. We point out that these interpretations make the game model flexible enough to capture strategic interactions among participants in organizations or systems arising in different scenarios, such as economic or social institutions and wireless networks. Finally, we turn to correlated learning over networks and argue that it is not only a descriptive model explaining how correlations over networks emerge from non-equilibrium dynamic behaviors but also sheds new light on the prescriptive design of learning schemes that steer networked agents toward desired correlations. We conclude the presentation by briefly discussing the confluence of game-theoretic learning, network systems, and artificial intelligence (AI), which offers a promising route for the further development of game-theoretic learning in the era of modern AI.
2017
- Directional Framelets and its Application in Medical Imaging. In PIMS-AMI Workshop on Applied Harmonic Analysis, University of Alberta, Edmonton, Canada, Aug 2017
A directional compactly supported d-dimensional Haar tight framelet is constructed such that all the high-pass filters in its underlying tight framelet filter bank have only two nonzero coefficients with opposite signs, and these filters exhibit (3^d - 1)/2 directions in total in dimension d. Furthermore, applying the projection method to such a tight framelet, a directional compactly supported box spline tight framelet with simple geometric structure is built such that all the high-pass filters in its underlying tight framelet filter bank likewise have only two nonzero coefficients with opposite signs. Moreover, such compactly supported box spline tight framelets can achieve arbitrarily high numbers of directions by using refinable box splines with increasing supports. Their application to pMRI with good performance is presented.
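The stated direction count follows from a simple counting argument if, as the two-tap structure suggests, each high-pass filter is supported on a pair of grid points separated by a nonzero offset in {-1, 0, 1}^d with offsets identified up to sign; this indexing is an assumption made here only to make the count explicit.

```latex
% Counting nonzero offsets v in {-1,0,1}^d identified up to sign (v ~ -v):
\[
  \#\Bigl(\bigl(\{-1,0,1\}^d \setminus \{0\}\bigr)\big/\{\pm 1\}\Bigr)
  \;=\; \frac{3^d - 1}{2},
  \qquad d=2:\ 4 \text{ directions}, \qquad d=3:\ 13 \text{ directions}.
\]
```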