Robust optimization via targeted sampling for machine learning
Professor: Debasish Chatterjee
UID: CM01
Optimal learning
Professor: Debasish Chatterjee
UID: CM02
CM01
Description: This project centers on an entirely
novel approach to robust optimization in the convex regime.
More specifically, we will explore specific applications of
convex semi-infinite programs, via a newly developed “targeted
sampling” technique, in machine learning, estimation and
statistics, portfolio optimization, etc. Students in the third
year of their UG studies who are equipped with a deep sense of
curiosity are encouraged to apply. Efforts will be split equally
between learning new theory and developing numerical tools for
the aforementioned applications, and there is a strong
possibility of filing patents in each case.
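To fix ideas, a convex semi-infinite program has finitely many decision variables but infinitely many constraints, and sampling replaces that infinite family by a finite one. The sketch below illustrates the plain scenario (random-sampling) approach on a toy robust linear program; the project's “targeted sampling” technique is a refinement of how such samples are chosen, and is not reproduced here. The uncertainty set, objective, and sample size are illustrative assumptions.

```python
# Generic scenario approach to a convex semi-infinite program:
# enforce a^T x <= b for every a in an uncertainty ball by sampling
# finitely many a's and solving the resulting finite LP.
# (Illustrative only; the project's targeted sampling is a refinement.)
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Nominal constraint vector a0 with uncertainty ||a - a0|| <= radius
a0 = np.array([1.0, 1.0])
radius, b = 0.1, 1.0

# Draw N scenarios a_i uniformly from the uncertainty ball (dim = 2)
N = 200
deltas = rng.normal(size=(N, 2))
deltas *= (radius * rng.uniform(size=(N, 1)) ** 0.5
           / np.linalg.norm(deltas, axis=1, keepdims=True))
A_ub = a0 + deltas              # row i encodes the constraint a_i^T x <= b
b_ub = np.full(N, b)

# Maximize x1 + x2 (minimize its negative) over the sampled feasible set
res = linprog(c=[-1.0, -1.0], A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)])
x_scenario = res.x              # feasible for all sampled constraints
```

As the number of scenarios grows, the solution of the finite LP violates the remaining (unsampled) constraints with vanishing probability; quantifying and accelerating this is exactly where a targeted choice of samples enters.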
Number of students: 2
Year of study: Students entering 3rd year, Students entering
4th/5th year
CPI: 8.5 and above
Prerequisites: Background in optimization and
probability.
Duration: 3 months, extendable up to 3 years.
Learning outcome: Patents in specific cases; introduction
to entirely novel ideas in robust optimization.
Weekly time commitment: At least 20 hours during the
vacation, at least 5 during the semester.
Instructions for assignment: The basic technique is
contained in the indicated article; specific cases will require
the development of fine-tuned theory/computational software.
CM02
Description: This project centers on a novel approach
to learning from data à la function approximation, with the
Chebyshev center problem as the centerpiece in this context.
Given a compact subset K of R^n, the Chebyshev center problem
consists of finding a minimal circumscribing ball (a so-called
Chebyshev ball) containing K. In the context of learning from
data, the problem translates to finding the center of a
Chebyshev ball described by a dictionary of functions consistent
with the data. Finding the Chebyshev ball is an NP-hard problem
in general, but recent advances have established ideas for
numerically tractable sharp approximations, and the non-noisy
case (i.e., optimal interpolation) has been worked out in
detail. This project is about carrying out the computations in
the case of noisy data. Efforts will be split equally between
learning new theory and developing numerical tools for the
aforementioned applications, and there is a strong possibility
of writing journal papers and filing patents.
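As a concrete warm-up, when K is just a finite point cloud the Chebyshev center is the minimizer of the convex function f(c) = max_i ||p_i - c||, i.e., the center of the smallest enclosing ball. The sketch below computes it numerically for the vertices of the unit square; the point set and the derivative-free solver are illustrative choices, not the project's actual setting (where K is described through a dictionary of functions).

```python
# Chebyshev center of a finite point set: minimize the convex,
# nonsmooth function f(c) = max_i ||p_i - c|| (radius of the smallest
# ball centered at c containing every point).
# Toy instance only; the project treats K described by function dictionaries.
import numpy as np
from scipy.optimize import minimize

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def max_dist(c):
    # radius needed for a ball centered at c to contain all the points
    return np.linalg.norm(points - c, axis=1).max()

# f is convex but nonsmooth, so a derivative-free method is a safe choice
res = minimize(max_dist, x0=points.mean(axis=0), method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8})
center, radius = res.x, res.fun
# For the unit square the Chebyshev center is (0.5, 0.5), radius sqrt(1/2)
```

The general problem replaces the finite maximum by a supremum over a compact (typically infinite) set, which is what makes it NP-hard in general and what the tractable approximations address.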
Number of students: 2
Year of study: Students entering 3rd year, Students
entering 4th/5th year
CPI: 8.5 and above
Prerequisites: Background in optimization and
probability.
Duration: 3 months extendable up to 3 years.
Learning outcome: Introduction to a novel numerical
apparatus and the challenging problem of optimal learning from
data.
Weekly time commitment: At least 20 hours during the
vacation, at least 5 during the semester.
Instructions for assignment: The first document contains
a discussion of the Chebyshev center problem in the context of
optimal learning (the non-noisy case). The authors' assertion
concerning the quality of approximation possible (equation
(3.11)) is suboptimal, and the factor C there can be made equal
to 1! The germ of this idea is in the second document. Study the
first document carefully to get an idea of the mathematics
involved; the numerical algorithm has to be designed along the
lines of the second document.