I am part of the Multi-cloud and AI Platform department at IBM Research - India.
I am currently working on data preparation for training IBM's Large Language Model for Code (Code LLM). This covers the cleaning and processing steps applied to code data before it can be used for model training. Data preparation is a critical step in building an LLM and plays a major role in determining the quality and capabilities of the trained model.
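As a rough illustration of what such cleaning steps can look like (a minimal sketch with hypothetical heuristics and thresholds, not the actual pipeline used for the IBM Code LLM), a pre-processing pass often combines simple quality filters with exact deduplication of files:

```python
import hashlib

# Illustrative thresholds only; real pipelines tune these per language and corpus.
MAX_LINE_LENGTH = 1000          # very long lines often indicate minified/generated code
MAX_NON_ALNUM_FRACTION = 0.4    # files dominated by symbols are likely binary-like or data dumps


def looks_like_natural_code(text: str) -> bool:
    """Cheap heuristics to filter out generated, minified, or binary-like files."""
    lines = text.splitlines()
    if not lines:
        return False
    if max(len(line) for line in lines) > MAX_LINE_LENGTH:
        return False
    non_alnum = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    return non_alnum / max(len(text), 1) <= MAX_NON_ALNUM_FRACTION


def clean_and_deduplicate(files):
    """Yield one copy of each unique file that passes the quality heuristics.

    `files` is an iterable of (path, text) pairs.
    """
    seen_hashes = set()
    for path, text in files:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of a file already kept
        if looks_like_natural_code(text):
            seen_hashes.add(digest)
            yield path, text
```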
Prior to my current work, I was involved in the IBM watsonx Code Assistant for Ansible Generation project (WCA for Ansible). IBM WCA for Red Hat Ansible Lightspeed is a purpose-built generative AI tool that helps Ansible developers create Ansible playbooks by adding AI-assisted automation to the playbook creation process. I was part of the team responsible for preparing high-quality data for training the generative AI model in IBM WCA for Ansible.
My earlier projects in the lab included:
- Multi-objective optimization of ML models: Currently available AutoML frameworks mostly optimize ML models for a single objective (such as model accuracy). The goal of this work was to optimize the ML pipeline for multiple objectives (such as accuracy and fairness) and obtain the set of Pareto-optimal solutions.
- Elastic Deep Learning: The goal of this effort was to efficiently perform distributed DL training in an environment with varying hardware resources (compute, network, etc.), such as the cloud. (arXiv link)
- Distributed Deep Learning: This involved co-developing a distributed version of a Torch-based ImageNet application, which was used to train a ResNet-50 model on the ImageNet-1K dataset using 256 GPUs. At that time (2017), CNN models yielded state-of-the-art performance on this task.
- IBM FastForward project, where I was responsible for improving application performance on future exascale systems through the use of assembly programming and OpenMP 4.0 accelerator support.
- UBD-IBM Research Collaboration, an effort to perform joint research with Universiti Brunei Darussalam (UBD) in the areas of weather/climate modelling, flood forecasting, and renewable energy (solar/wind farms). My focus in this effort was on the following two work streams:
(1) The computational aspects of the weather and flood models for operational forecasting. As part of this work, we evaluated and optimized WRF performance on IBM Blue Gene systems for nested domain configurations. I also worked on developing a GPU implementation of IBM's flood forecasting model.
(2) Dissemination of weather and flood forecasts through web and mobile portals.
- Benchmarking and performance improvement of the HPC Challenge benchmarks on HPC systems such as IBM Blue Gene.
- Development of a parallel version of non-rigid image registration on the GPU using the CUDA programming model.
- Optimization of the BLAS linear algebra library on the Cell processor. This required developing various BLAS routines that exploit the novel hardware features of the Cell processor. The optimized BLAS library was part of the publicly available IBM CellSDK.
Prior to joining IBM Research, I worked on the development of cross-platform, scalable graphics applications and libraries for large visualization systems.
Publications and Patents
My work has resulted in 15 publications in peer-reviewed conferences and workshops and 10 filed patents.
Awards
- IBM Research Image Award (2017)
- IBM Chairman's Environmental Award (2017)
- Best Paper Award at the HiPC 2010 conference
- IBM Research Accomplishment Award (2023, 2017, 2013, 2011, 2007)
Education
5-year Integrated M.Tech. in Mathematics and Computing from IIT Delhi (2003).