Joseph ‘Yossi’ Keshet

Speech, Language, and Artificial Intelligence.

I am an Associate Professor at the Viterbi Faculty of Electrical and Computer Engineering at the Technion – Israel Institute of Technology, where I direct the Speech, Language, and AI Lab, affiliated with the Signal and Image Processing Lab (SIPL).


I am also Chief Scientist at aiOla, which helps enterprises integrate Conversational AI through APIs or apps. aiOla’s technology adapts to any language, jargon, or accent, unlocking spoken data, streamlining workflows, and improving the user experience.

Full CV

My Research

I am passionate about human speech, one of the most fundamental yet intricately complex signals we know. Speech not only carries the information a speaker intends to convey but also reveals their identity, emotional state, and even medical condition. My drive to understand and quantify speech shapes my research interests, which span both machine learning and the computational study of human speech and language.

Publications

Transformer architectures, similar to those behind ChatGPT, are the foundation of today’s most successful speech recognition systems. Likewise, diffusion-based models have become key to state-of-the-art speech synthesis and generation. My recent publications focus on improving the performance and conceptual understanding of these architectures within the speech domain.
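
As a concrete illustration of transformer-based speech recognition (a generic sketch, not our lab’s own system), the snippet below transcribes an audio file with torchaudio’s pretrained wav2vec 2.0 pipeline and greedy CTC decoding. The audio path is hypothetical, and the file is assumed to be mono.

```python
# Minimal sketch: transcription with a pretrained wav2vec 2.0 model via
# torchaudio. Assumes torch and torchaudio are installed; the audio path
# below is hypothetical.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model().eval()
labels = bundle.get_labels()  # CTC label set; index 0 is the blank token

waveform, sample_rate = torchaudio.load("example.wav")  # hypothetical file
if sample_rate != bundle.sample_rate:
    waveform = torchaudio.functional.resample(
        waveform, sample_rate, bundle.sample_rate
    )

with torch.inference_mode():
    # For a mono file, shape (1, time) doubles as a batch of one.
    emission, _ = model(waveform)  # (batch, frames, vocab) scores

# Greedy CTC decoding: best label per frame, collapse repeats, drop
# blanks; "|" marks word boundaries in this label set.
indices = emission[0].argmax(dim=-1).tolist()
tokens, prev = [], None
for i in indices:
    if i != prev and i != 0:
        tokens.append(labels[i])
    prev = i
print("".join(tokens).replace("|", " ").strip())
```

Greedy decoding is the simplest option; beam search with a language model typically yields lower word error rates.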

All publications

Courses

Speech Processing with Deep Learning

Speech recognition and synthesis have become foundational components in the development of modern artificial intelligence, bridging the gap between human communication and machine interaction. This course aims to provide a comprehensive introduction to the core principles of speech signal processing and modeling, essential for understanding how machines perceive and generate human language.
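
To make “how machines perceive human language” concrete, a common first step is converting the waveform into log-mel features. Here is a minimal sketch with torchaudio; the parameter values are typical but illustrative.

```python
# Minimal sketch of a standard speech front-end: waveform -> log-mel
# spectrogram. Parameter values are illustrative.
import torch
import torchaudio

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000,  # assumed input sampling rate
    n_fft=400,          # 25 ms analysis window at 16 kHz
    hop_length=160,     # 10 ms hop between frames
    n_mels=80,          # number of mel filterbank channels
)

waveform = torch.randn(1, 16000)             # one second of stand-in audio
features = torch.log(mel(waveform) + 1e-6)   # (channel, n_mels, frames)
print(features.shape)                        # torch.Size([1, 80, 101])
```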

Deep Learning

This course covers key theoretical and practical tools for deep learning, with an emphasis on supervised learning. Topics include gradient descent, fully-connected and convolutional networks, training methods, and architectures for sequential data such as transformers. We also explore techniques that improve data and resource efficiency, such as pre-training, self-supervised learning, quantization, and pruning.
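
As a taste of the first topics, here is a minimal sketch of a fully-connected network trained by gradient descent in PyTorch; the data and hyperparameters are placeholders.

```python
# Minimal sketch: training a small fully-connected network with
# gradient descent in PyTorch. Data and hyperparameters are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)                 # stand-in inputs
y = (X.sum(dim=1) > 0).long()            # stand-in binary labels

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                      # backpropagate gradients
    optimizer.step()                     # one gradient-descent update
print(f"final loss: {loss.item():.4f}")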

Transformers and Large Language Models

Transformers are powerful deep learning architectures that have revolutionized how machines understand and generate sequential data. From enabling breakthroughs in language models like ChatGPT to powering cutting-edge systems in speech recognition and computer vision, Transformers have become a cornerstone of modern AI. This course explores their design, capabilities, and impact across a range of domains.
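
The core operation behind these models is compact enough to sketch directly: single-head scaled dot-product self-attention, shown below with illustrative sizes.

```python
# Minimal sketch of single-head scaled dot-product self-attention,
# the core operation of a transformer layer. Sizes are illustrative.
import math
import torch
import torch.nn as nn

d_model = 64
seq = torch.randn(1, 10, d_model)        # (batch, tokens, features)

W_q, W_k, W_v = (nn.Linear(d_model, d_model, bias=False) for _ in range(3))
Q, K, V = W_q(seq), W_k(seq), W_v(seq)

# attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
scores = Q @ K.transpose(-2, -1) / math.sqrt(d_model)
weights = torch.softmax(scores, dim=-1)  # each row sums to 1
out = weights @ V                        # (1, 10, 64): contextualized tokens
print(out.shape)
```

Full transformer layers add multiple heads, residual connections, layer normalization, and position-wise feed-forward blocks around this operation.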

Course numbers: ECE 460747, ECE 460217, ECE 0480011