Bio

I am a Marie Skłodowska-Curie fellow (MSCA COFUND IST-BRIDGE) at the Institute of Science and Technology Austria (ISTA), working in the group led by Prof. Dan Alistarh.

Before joining ISTA in 2022, I was a postdoctoral research fellow at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia from 2019 to 2022, in the group led by Prof. Peter Richtárik. Prior to that, I worked with Prof. Diogo Gomes at KAUST as a research technician from 2016 to 2019. I obtained my Ph.D. in Mathematics in 2018 from Yerevan State University (YSU) in Armenia, under the supervision of Prof. Grigori Karagulyan.

Research Interests

  • optimization (theory and algorithms), machine learning, federated learning
  • large-scale, convex/non-convex, stochastic/deterministic optimization, variance reduction
  • communication/computation/memory-efficient and scalable optimization algorithms
  • collaborative learning (asynchronous, adversarial, local training, heterogeneity, etc.)
  • model compression (knowledge distillation, pruning, sparse optimization, quantization)
  • information theory (compression, encoding schemes, vector quantization)

Research

My current research focuses on optimization theory and algorithms for machine learning, with an emphasis on efficiency, scalability, and the theoretical understanding of optimization methods. These methods are particularly relevant to large-scale machine learning training and federated learning. Because my work is driven by applications in machine learning, my publications appear primarily in leading machine learning conferences such as NeurIPS, ICML, AISTATS, and ICLR, as well as in journals such as JMLR and TMLR.

I completed my Ph.D. in real harmonic analysis, a branch of mathematics that explores the relationship between functions or signals and their frequency-domain representations. My thesis investigated the convergence and divergence properties of certain convolution-type integral operators. I defended it in 2018, having published five journal papers, including two single-authored papers and two in The Journal of Geometric Analysis.

In addition, I have done research in algebra. During my undergraduate studies at YSU, I completed a research project on universal algebraic structures called dimonoids, which led to a publication in Algebra and Discrete Mathematics. Later, at KAUST, I worked on symbolic computation, specifically on developing computer algebra techniques for automating certain aspects of PDE analysis.

For the complete list of my publications, please visit my Google Scholar page.

News

October 2024

New paper on arXiv.

LDAdam: Adaptive Optimization from Low-dimensional Gradient Statistics - joint work with Thomas Robert, Ionut-Vlad Modoranu, and Dan Alistarh.

Abstract: We introduce LDAdam, a memory-efficient optimizer for training large models, which performs adaptive optimization steps within lower-dimensional subspaces while consistently exploring the full parameter space during training. This strategy keeps the optimizer's memory footprint to a fraction of the model size. LDAdam relies on a new projection-aware update rule for the optimizer states that allows for transitioning between subspaces, i.e., estimation of the statistics of the projected gradients. To mitigate the errors due to low-rank projection, LDAdam integrates a new generalized error feedback mechanism, which explicitly accounts for both gradient and optimizer state compression. We prove the convergence of LDAdam under standard assumptions, and show that LDAdam allows for accurate and efficient fine-tuning and pre-training of language models.
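To make the idea concrete, here is a minimal, hypothetical sketch of one adaptive step with low-rank gradient statistics and error feedback, in the spirit of the abstract above. It is not the paper's implementation: the function name ldadam_like_step, the fixed orthonormal basis P, and the hyperparameter defaults are all illustrative assumptions, and the sketch omits LDAdam's projection-aware transition of optimizer states between changing subspaces.

    import numpy as np

    def ldadam_like_step(W, grad, P, m, v, err, t, lr=1e-3,
                         beta1=0.9, beta2=0.999, eps=1e-8):
        """One illustrative adaptive step in an r-dimensional subspace.

        W    : (n, d) parameter matrix
        grad : (n, d) full gradient
        P    : (d, r) orthonormal projection basis, r << d
        m, v : (n, r) first/second moment estimates, kept in the subspace
        err  : (n, d) accumulated compression error (error feedback buffer)
        t    : step counter (1-indexed), used for bias correction
        """
        # Error feedback: fold previously discarded gradient info back in.
        g = grad + err
        # Compress the gradient onto the r-dimensional subspace.
        g_low = g @ P                        # (n, r)
        # Record what the projection lost, to be reinjected next step.
        err = g - g_low @ P.T
        # Adam-style moment updates on the r-dim statistics only, so the
        # optimizer state costs O(n*r) memory instead of O(n*d).
        m = beta1 * m + (1 - beta1) * g_low
        v = beta2 * v + (1 - beta2) * g_low ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Map the low-dimensional update back to the full parameter space.
        W = W - lr * (m_hat / (np.sqrt(v_hat) + eps)) @ P.T
        return W, m, v, err

    # Toy usage on the quadratic loss 0.5*||W||^2, whose gradient is W.
    rng = np.random.default_rng(0)
    n, d, r = 64, 256, 8
    P, _ = np.linalg.qr(rng.standard_normal((d, r)))   # orthonormal basis
    W = rng.standard_normal((n, d))
    m, v = np.zeros((n, r)), np.zeros((n, r))
    err = np.zeros((n, d))
    for t in range(1, 11):
        W, m, v, err = ldadam_like_step(W, W, P, m, v, err, t)

With a fixed basis, plain low-rank projection would permanently discard the gradient components outside the subspace; the error buffer is what lets those components accumulate and eventually influence the update, which is the role the generalized error feedback mechanism plays in the abstract.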

October 2024

Secondment with Neural Magic, Inc.

As part of the fellowship, I have started my secondment with Neural Magic, Inc. in the USA, working with Dr. Alexandre Marques on LLM compression.

Contacts

Building West, Level 1, 21-01-122, ISTA, Am Campus 1, 3400 Klosterneuburg, Austria

Mher Safaryan