Swagath Venkataramani

Overview

Title

Principal Research Scientist, AIU Architecture and Compilers

Location

IBM Research - Yorktown Heights, Yorktown Heights, NY, USA

Publications

A software-assisted peak current regulation scheme to improve power-limited inference performance in a 5nm AI SoC
- Monodeep Kar, Joel Silberman, et al.
- 2024
- ISSCC 2024
Deep Compression of Pre-trained Transformer Models
- Naigang Wang, Chi-Chun Liu, et al.
- 2022
- NeurIPS 2022
Approximate computing and the efficient machine learning expedition
- Jörg Henkel, Hai Li, et al.
- 2022
- ICCAD 2022
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators
- Subhankar Pal, Swagath Venkataramani, et al.
- 2022
- Transactions on Embedded Computing Systems
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization
- Andrea Fasoli, Chia-Yu Chen, et al.
- 2022
- INTERSPEECH 2022
Accelerating DNN Training Through Selective Localized Learning
- Sarada Krithivasan, Sanchari Sen, et al.
- 2022
- Frontiers in Neuroscience
A 7-nm Four-Core Mixed-Precision AI Chip with 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling
- Sae Kyu Lee, Ankur Agrawal, et al.
- 2021
- IEEE JSSC
4-bit quantization of LSTM-based speech recognition models
- Andrea Fasoli, Chia-Yu Chen, et al.
- 2021
- INTERSPEECH 2021
Efficacy of Pruning in Ultra-Low Precision DNNs
- Sanchari Sen, Swagath Venkataramani, et al.
- 2021
- ISLPED 2021
RaPiD: AI Accelerator for Ultra-Low Precision Training and Inference
- Swagath Venkataramani, Vijayalakshmi Srinivasan, et al.
- 2021
- ISCA 2021

Patents

- 06 Dec 2022
- GB
- 2590000
Low Precision Deep Neural Network Enabled By Compensation Instructions
- 20 Oct 2022
- JP
- 7163381
Facilitating Neural Network Efficiency
- 29 Aug 2022
- US
- 11429524
Optimized Hierarchical Scratchpads For Enhanced Artificial Intelligence Accelerator Core Utilization
- 06 Jun 2022
- US
- 11354573
Dynamically Resizing Minibatch In Neural Network Execution
- 28 Feb 2022
- US
- 11263518
Bi-scaled Deep Neural Networks
- 06 Dec 2021
- US
- 11195096
Facilitating Neural Network Efficiency
- 29 Nov 2021
- US
- 11188820
Deep Neural Network Performance Analysis On Shared Memory Accelerator Systems
- 14 Jun 2021
- US
- 11037650
Self-evaluating Array Of Memory
- 24 May 2021
- US
- 11016840
Low-overhead Error Prediction And Preemption In Deep Neural Network Using Apriori Network Statistics
- 16 Nov 2020
- US
- 10838868
Programmable Data Delivery By Load And Store Agents On A Processing Chip Interfacing With On-chip Memory Components And Directing Data To External Memory Components

Top collaborators

Xiaodong Cui

Principal Research Scientist

Alberto Mannari

Software Developer

Jinwook Jung

Research Staff Member

Mori Ohara

Deputy Director, IBM Research Tokyo, Distinguished Engineer, Chief SW Engineer for Hybrid Cloud on IBM HW