Khiem Pham


I am currently a Research Resident at VinAI Research, working with Dr. Nhat Ho and Dr. Hung Bui. I am broadly interested in Statistical Machine Learning, especially scalable algorithms.


Currently, my main interest is Optimal Transport theory (sample complexity, scalability, and Machine Learning applications). My secondary interests include (Deep) Probabilistic Modeling, Approximate Inference, and Deep Representation Learning. Lately, I have also been playing with Binary Neural Networks.


Resume

Research

On Unbalanced Optimal Transport: Analysis of Sinkhorn Algorithm

We showed the near-linear time, dimension-independent computational complexity of the Sinkhorn algorithm for finding an epsilon-approximate solution to the entropic-regularized Unbalanced Optimal Transport problem between two measures of possibly different masses (a minimal sketch of the scaling iteration follows this entry).

with Khang Le, Nhat Ho, Tung Pham, Hung Bui.
International Conference on Machine Learning, 2020
paper | slides | poster | implementation
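A minimal NumPy sketch of the generalized Sinkhorn scaling updates for entropic-regularized UOT, in the spirit of the paper: the marginal constraints are relaxed by KL penalties with weight tau, which turns the usual Sinkhorn ratio updates into power updates with exponent tau / (tau + eps). The parameter names and fixed iteration count here are illustrative; the linked implementation is the authoritative version.

```python
import numpy as np

def sinkhorn_uot(a, b, C, eps=0.1, tau=1.0, n_iter=500):
    """Generalized Sinkhorn scaling for entropic-regularized UOT.

    a, b : nonnegative weight vectors (total masses may differ).
    C    : cost matrix of shape (len(a), len(b)).
    eps  : entropic regularization strength.
    tau  : weight of the KL penalty on marginal violations.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u, v = np.ones_like(a), np.ones_like(b)
    fi = tau / (tau + eps)               # exponent from the KL relaxation
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]   # plan pi = diag(u) K diag(v)

# Toy usage: two discrete measures with different total masses.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(7, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
pi = sinkhorn_uot(np.full(5, 0.3), np.full(7, 0.1), C)
print(pi.sum())  # transported mass need not match either input exactly
```

Each iteration is a pair of matrix-vector products with the Gibbs kernel, so the per-iteration cost is linear in the size of the cost matrix, which is what the near-linear overall complexity builds on.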
Combining GHOST and Casper

We proposed and analyzed "Gasper", a proof-of-stake-based consensus protocol that will be used in the Ethereum 2.0 beacon chain.

with Vitalik Buterin, Diego Hernandez, Thor Kamphefner, Zhi Qiao, Danny Ryan, Juhyeok Sin, Ying Wang, Yan X Zhang.
To be submitted to the Stanford Blockchain Conference, 2021
paper
Large-scale Spectral Clustering using Diffusion Coordinates on Landmark-based Bipartite Graph

We reduce the run-time of Spectral Clustering by sampling a set of representative points and forming a bipartite graph between the data and the representatives. Unlike previous work, we cluster these representative points and then extend the clustering to the original data. Experiments showed that the algorithm is both faster and more robust to the sampling (a minimal sketch follows this entry).

with Guangliang Chen.
NAACL-HLT 2018 12th Workshop on Graph-based Natural Language Processing, New Orleans, Louisiana, June 2018
paper | slides | implementation
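A minimal sketch of the pipeline described above, assuming uniformly sampled landmarks, a Gaussian affinity, and scikit-learn's KMeans; the parameter names (n_landmarks, sigma) are illustrative and the linked implementation is the authoritative version.

```python
import numpy as np
from sklearn.cluster import KMeans

def landmark_spectral_clustering(X, n_clusters, n_landmarks=100, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Sample representative points (landmarks) from the data.
    landmarks = X[rng.choice(len(X), n_landmarks, replace=False)]
    # 2. Bipartite affinity between all n points and the m landmarks.
    d2 = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    # 3. Degree-normalize and take the SVD; singular-value-scaled right
    #    singular vectors act as diffusion coordinates for the landmarks.
    #    (In practice the trivial leading component is often dropped.)
    Wn = W / np.sqrt(W.sum(axis=1))[:, None] / np.sqrt(W.sum(axis=0))[None, :]
    U, S, Vt = np.linalg.svd(Wn, full_matrices=False)
    V = (Vt[:n_clusters] * S[:n_clusters, None]).T      # (m, k) embedding
    # 4. Cluster the landmarks (not the full data) in the embedding.
    landmark_labels = KMeans(n_clusters, n_init=10).fit_predict(V)
    # 5. Extend to the original data: each point takes the label of its
    #    highest-affinity landmark.
    return landmark_labels[np.argmax(W, axis=1)]

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
labels = landmark_spectral_clustering(X, n_clusters=2, n_landmarks=50)
```

Since k-means runs only on the landmarks rather than on all data points, the clustering step no longer dominates the run-time, and the final assignment is a single argmax over the bipartite affinities.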
Evaluating Syntactic Properties of Seq2seq Output with a Broad Coverage HPSG: A Case Study on Machine Translation

Sequence to sequence (seq2seq) models are often employed in settings where the target output is natural language. However, the syntactic properties of the language generated by these models are not well understood. We explore whether such output belongs to a formal and realistic grammar, by employing the English Resource Grammar (ERG), a broad-coverage, linguistically precise HPSG-based grammar of English. From a French to English parallel corpus, we analyze the parseability and grammatical constructions occurring in output from a seq2seq translation model. Over 93% of the model translations are parseable, suggesting that the model learns to generate output conforming to a grammar. The model has trouble learning the distribution of rarer syntactic rules, and we pinpoint several constructions that differentiate the reference translations from the model's (a hedged sketch of the parseability check follows this entry).

with Johnny Wei, Brendan O'Connor, Brian Dillon.
EMNLP 2018 Workshop on Analyzing and Interpreting Neural Networks for NLP, Brussels, Belgium, October 2018
paper
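As a rough illustration of the parseability measurement, the sketch below feeds each model translation to ACE with a compiled ERG image via pyDelphin and reports the fraction of sentences receiving at least one analysis. The grammar path "erg.dat" is a placeholder, and this is an assumed reconstruction, not the paper's exact pipeline.

```python
# Assumed reconstruction using pyDelphin's ACE interface; "erg.dat" is a
# placeholder path to a compiled ERG grammar image, and the paper's exact
# preprocessing (tokenization, length cutoffs) is not reproduced here.
from delphin import ace

def parseability_rate(sentences, grammar="erg.dat"):
    parsed = 0
    for sent in sentences:
        response = ace.parse(grammar, sent)  # run ACE on one sentence
        if response.get("results"):          # at least one ERG analysis
            parsed += 1
    return parsed / len(sentences)

translations = ["The cat sat on the mat.", "Colorless green ideas sleep furiously."]
print(f"parseable: {parseability_rate(translations):.1%}")
```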

Contact

Contact me at duckhiem95@gmail.com. Ask me anything!