Combinatorial algorithms for high dimensional statistics

  Speaker: Zhenming Liu, College of William & Mary

  Time: 14:00 - 15:00, June 19, Wednesday

  Place: Room 850, 8th floor, ICT, CAS


  This talk revisits the inference problem for the linear model y=Mx+?, where both x and y are vectors, M is the matrix to be inferred, and ? is a noise term. We focus on the high dimensional setting, in which the number of observations n is significantly smaller than the size of M. While this problem arises often in different areas, such as identification of biomarkers, understanding risks associated with various diseases, and image recognitions, we are specifically motivated by its application in forecasting equity return in the financial markets. Here, the response y is the future equity returns from a large universe (e.g., Zhongzheng 500). In the high-dimensional setting, most existing statistical models/algorithms aim to design suitable regularizers to achieve better variance-bias tradeoff.

  In this talk, we demonstrate that we can use combinatorial and graph-based techniques to solve high-dim problems. We examine a broader class of algorithmic problems that do not have convex objective so that we can effectively extract signals from a richer class of M. We present two results. First, we assume that M exhibits stochastic block structure and develop an inference algorithm inspired by Abraham, Chechik, Kempe, and Slivkins' algorithm for inferencing small world graphs. Second, we assume that M is low rank and develop a spectral-based algorithm that determines model complexity in a data-driven manner. Finally, we will evaluate the performance of these algorithms against an equity dataset, and discuss how these techniques can be applied to non-linear models.


  Zhenming Liu is an assistant professor in Computer Science in the College of William & Mary. He received his PhD at Harvard University in 2012 and was a postdoc at Princeton University. Before joining William & Mary, he served as a quant researcher in Two Sigma Investments (AUM: 50B USD). He was an intern in MSRA in 2008 and 2011. He received a number of best papers, including PKDD 2010 best student paper, Infocom 2015 best paper runner up, Fast 2019 best paper, SDM 2019 best applied data science paper. He was a recipient of the Rutherford fellowship (2018) from the Alan Turing Institute.