Delft University of Technology

PhD Position Tropical Kernel-Based Offline Reinforcement Learning – Delft University of Technology – Delft


This PhD position focuses on developing tropical (max-plus) kernel-based function approximation methods for offline reinforcement learning (RL).

Job description

Offline reinforcement learning (RL) seeks to derive effective sequential decision-making policies exclusively from pre-existing datasets, avoiding costly, risky, or time-consuming online interaction with the environment. This paradigm holds significant promise for leveraging large data repositories in domains such as healthcare, autonomous systems, and robotics. However, existing offline RL methods often rely on kernel or neural approximators whose inductive biases are poorly matched to the geometry of the Q-function, resulting in slow convergence and sample inefficiency.

This project proposes a novel approach to offline RL that uses tropical (i.e., max-plus) kernel-based function approximation for the Q-function. The core motivation is the inherent structural compatibility between the Bellman operator, which is central to dynamic programming and RL, and the operations of max-plus algebra. To overcome the potential representational limitations of purely max-plus-linear functions, we integrate kernel methods, which implicitly map data to high-dimensional feature spaces and thereby allow richer, non-linear approximations.
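To make the compatibility between the Bellman operator and max-plus algebra concrete, the following is a minimal illustrative sketch, not part of the project itself. It assumes the simplest setting, a deterministic and undiscounted MDP, in which the Bellman backup is a max-plus matrix-vector product. The 3-state reward matrix `A` and the helper names `maxplus_matvec` and `bellman` are hypothetical and chosen purely for illustration.

```python
NEG_INF = float("-inf")  # max-plus "zero": marks transitions that do not exist


def maxplus_matvec(A, v):
    """Max-plus product: (A ⊗ v)[i] = max_j (A[i][j] + v[j])."""
    return [max(a_ij + v_j for a_ij, v_j in zip(row, v)) for row in A]


# Hypothetical 3-state deterministic MDP (illustrative numbers only).
# A[s][s2] is the best one-step reward for moving from s to s2,
# or -inf if no action leads from s to s2.
A = [
    [0.0, 1.0, NEG_INF],
    [NEG_INF, 0.0, 2.0],
    [0.5, NEG_INF, 0.0],
]


def bellman(v):
    """Undiscounted Bellman backup: (T v)(s) = max_{s2} (A[s][s2] + v(s2))."""
    return maxplus_matvec(A, v)


v = [0.0, 1.0, -1.0]
w = [2.0, -1.0, 0.5]
c = 3.0

# Max-plus additivity: T(max(v, w)) == max(T v, T w), elementwise.
vw_max = [max(a, b) for a, b in zip(v, w)]
additive = bellman(vw_max) == [max(a, b) for a, b in zip(bellman(v), bellman(w))]

# Max-plus homogeneity: T(c + v) == c + T v.
homogeneous = bellman([c + x for x in v]) == [c + x for x in bellman(v)]

print(bellman(v), additive, homogeneous)
```

Both checks hold here because, in this deterministic setting, the backup only takes maxima and adds constants, which are exactly the two operations of max-plus algebra. With a discount factor γ, the shift property becomes T(c + v) = γc + T v instead.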

What you will do

  • Establish the theoretical foundation and design scalable algorithms for tropical kernel-based offline RL.
  • Identify MDP classes whose optimal Q-function lies in tropical function spaces.
  • Develop representer-type theorems for kernel-based approximation within these spaces.
  • Address computational challenges inherent in kernel methods in tropical function spaces (e.g., model-size explosion and quadratic constraint growth) by devising tropical analogues of scalable kernel approximation techniques.
  • Develop efficient optimization solvers for the resulting regression problems.

Teaching activities

Teaching activities are part of your PhD trajectory and may include, for example: supervising workgroups or lab sessions, assisting in courses, or mentoring BSc and MSc students. While teaching will not be your main responsibility, it offers valuable experience that supports your development and prepares you for future academic or professional roles. Teaching activities will not exceed 20% of your total appointment, averaged over the course of your PhD.

Job requirements

The position is well suited for candidates who sit at the intersection of theoretical machine learning, reinforcement learning, and optimization. Candidates are expected to carry out theory-driven RL research using kernels, optimization, and tropical methods. Therefore, a solid mathematical background and, more importantly, the willingness to develop it further as the project requires are essential. This position is not suited for those whose main interest is purely empirical deep RL benchmarking without an appetite for theoretical work. The minimum requirements for this position are:

  • A relevant MSc degree in systems and control, computer science, engineering, applied mathematics, or a related field.
  • Solid mathematical background, i.e., comfort with linear algebra, analysis, probability, optimization, and ideally some functional analysis or approximation theory.
  • Experience with (convex) optimization and algorithm design.
  • Experience with kernel methods is a plus but not required.
  • Experience with tropical methods is not required, but an interest in this aspect is important.
  • Excellent command of the English language and communication skills.

TU Delft (Delft University of Technology)

Delft University of Technology is a top international university combining science, engineering and design. It delivers world-class results in education, research and innovation to address challenges in the areas of energy, climate, mobility, health and digital society.

Faculty of Mechanical Engineering

The Faculty of Mechanical Engineering focuses on fundamental understanding, design, production (including application and product improvement), materials, processes and (mechanical) systems, supported by high-tech lab facilities and an international reach.

Conditions of employment

Doctoral candidates will be offered a 4-year period of employment in principle, but in the form of 2 employment contracts: an initial 1.5-year contract with an official go/no-go progress assessment within 15 months, followed by an additional contract for the remaining 2.5 years, provided that performance requirements are met.

As a PhD candidate you will be enrolled in the TU Delft Graduate School. The TU Delft Graduate School provides an inspiring research environment with an excellent team of supervisors, academic staff and a mentor. The Doctoral Education Programme is aimed at developing your transferable, discipline-related and research skills.

TU Delft offers a customisable compensation package, discounts on health insurance, and a monthly work costs contribution. Flexible work schedules can be arranged.
