Joseph ‘Yossi’ Keshet

Speech, Language, and Artificial Intelligence.

I am an Associate Professor at the Viterbi Faculty of Electrical and Computer Engineering at the Technion – Israel Institute of Technology, where I direct the Speech, Language, and AI Lab, affiliated with the Signal and Image Processing Lab (SIPL).


I am also Chief Scientist at aiOla, which helps enterprises integrate Conversational AI through APIs or apps. aiOla’s technology adapts to any language, jargon, or accent, unlocking spoken data, streamlining workflows, and improving the user experience.

Full CV

My Research

I am passionate about human speech, one of the most fundamental yet intricately complex signals we know. Speech not only carries the information a speaker intends to convey but also reveals their identity, emotional state, and even medical condition. My drive to understand and quantify speech shapes my research interests, which span both machine learning and the computational study of human speech and language.

Publications

Transformer architectures, similar to those behind ChatGPT, are the foundation of today’s most successful speech recognition systems. Likewise, diffusion-based models have become key to state-of-the-art speech synthesis and generation. My recent publications focus on improving the performance and conceptual understanding of these architectures within the speech domain.
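
As a concrete illustration of transformer-based speech recognition (a generic sketch, not our lab’s own system), the snippet below transcribes an audio file with torchaudio’s pretrained wav2vec 2.0 pipeline and greedy CTC decoding. The audio path is hypothetical, and the file is assumed to be mono.

```python
# Minimal sketch: transcription with a pretrained wav2vec 2.0 model via
# torchaudio. Assumes torch and torchaudio are installed; the audio path
# below is hypothetical.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model().eval()
labels = bundle.get_labels()  # CTC label set; index 0 is the blank token

waveform, sample_rate = torchaudio.load("example.wav")  # hypothetical file
if sample_rate != bundle.sample_rate:
    waveform = torchaudio.functional.resample(
        waveform, sample_rate, bundle.sample_rate
    )

with torch.inference_mode():
    # For a mono file, shape (1, time) doubles as a batch of one.
    emission, _ = model(waveform)  # (batch, frames, vocab) scores

# Greedy CTC decoding: best label per frame, collapse repeats, drop
# blanks; "|" marks word boundaries in this label set.
indices = emission[0].argmax(dim=-1).tolist()
tokens, prev = [], None
for i in indices:
    if i != prev and i != 0:
        tokens.append(labels[i])
    prev = i
print("".join(tokens).replace("|", " ").strip())
```

Greedy decoding is the simplest option; beam search with a language model typically yields lower word error rates.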

All publications

Courses

Speech Processing with Deep Learning

Speech recognition and synthesis have become foundational components in the development of modern artificial intelligence, bridging the gap between human communication and machine interaction. This course aims to provide a comprehensive introduction to the core principles of speech signal processing and modeling, essential for understanding how machines perceive and generate human language.
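
To make “how machines perceive human language” concrete, a common first step is converting the waveform into log-mel features. Here is a minimal sketch with torchaudio; the parameter values are typical but illustrative.

```python
# Minimal sketch of a standard speech front-end: waveform -> log-mel
# spectrogram. Parameter values are illustrative.
import torch
import torchaudio

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000,  # assumed input sampling rate
    n_fft=400,          # 25 ms analysis window at 16 kHz
    hop_length=160,     # 10 ms hop between frames
    n_mels=80,          # number of mel filterbank channels
)

waveform = torch.randn(1, 16000)             # one second of stand-in audio
features = torch.log(mel(waveform) + 1e-6)   # (channel, n_mels, frames)
print(features.shape)                        # torch.Size([1, 80, 101])
```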

Deep Learning

This course covers key theoretical and practical tools for deep learning, with an emphasis on supervised learning. Topics include gradient descent, fully-connected and convolutional networks, training methods, and architectures for sequential data such as transformers. We also explore techniques that improve data and resource efficiency, such as pre-training, self-supervised learning, quantization, and pruning.
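
As a taste of the first topics, here is a minimal sketch of a fully-connected network trained by gradient descent in PyTorch; the data and hyperparameters are placeholders.

```python
# Minimal sketch: training a small fully-connected network with
# gradient descent in PyTorch. Data and hyperparameters are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)                 # stand-in inputs
y = (X.sum(dim=1) > 0).long()            # stand-in binary labels

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                      # backpropagate gradients
    optimizer.step()                     # one gradient-descent update
print(f"final loss: {loss.item():.4f}")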

Transformers and Large Language Models

Transformers are powerful deep learning architectures that have revolutionized how machines understand and generate sequential data. From enabling breakthroughs in language models like ChatGPT to powering cutting-edge systems in speech recognition and computer vision, Transformers have become a cornerstone of modern AI. This course explores their design, capabilities, and impact across a range of domains.
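
The core operation behind these models is compact enough to sketch directly: single-head scaled dot-product self-attention, shown below with illustrative sizes.

```python
# Minimal sketch of single-head scaled dot-product self-attention,
# the core operation of a transformer layer. Sizes are illustrative.
import math
import torch
import torch.nn as nn

d_model = 64
seq = torch.randn(1, 10, d_model)        # (batch, tokens, features)

W_q, W_k, W_v = (nn.Linear(d_model, d_model, bias=False) for _ in range(3))
Q, K, V = W_q(seq), W_k(seq), W_v(seq)

# attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
scores = Q @ K.transpose(-2, -1) / math.sqrt(d_model)
weights = torch.softmax(scores, dim=-1)  # each row sums to 1
out = weights @ V                        # (1, 10, 64): contextualized tokens
print(out.shape)
```

Full transformer layers add multiple heads, residual connections, layer normalization, and position-wise feed-forward blocks around this operation.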

Course numbers: ECE 460747, ECE 460217, ECE 0480011