Speech Technology and Research Lab

Communicating with, and through, computer applications

The Speech Technology and Research (STAR) Laboratory brings together a multidisciplinary mix of engineers, computer scientists and linguists. Together, our experts build systems for a wide range of applications including signal processing; data indexing and mining; and computer-aided learning. SRI’s speech and language technologies allow us to interact more naturally with computing applications and provide a wealth of actionable information about our intentions, health, and emotional state.

Real-world impact

March 26, 2024

SRI’s AI-driven voice analysis could help screen for mental health conditions

Researchers at SRI are developing tools to help clinicians keep a close eye on depression, PTSD, and other mental health issues.

October 16, 2023

SRI is developing textiles that record audio

Turning piezoelectric materials and lithium-ion batteries into thread, innovators will weave fabrics that record sound.

July 5, 2022

Nuance Partners with SCIENTIA Puerto Rico

SRI spin-out Nuance Communications will expand access to its Dragon Medical One for the island’s physicians and nurses.

Featured researchers

September 8, 2021

Dimitra Vergyri

Director, Speech Technology and Research Laboratory (STAR)

September 8, 2021

Horacio Franco

Chief Scientist, Speech Technology and Research Laboratory

September 8, 2021

Aaron Lawson

Assistant Laboratory Director, Speech Technology and Research Laboratory

September 8, 2021

Martin Graciarena

Technical Manager, Speech Technology and Research Laboratory

September 8, 2021

Mitchell McLaren

Senior Computer Scientist, Speech Technology and Research Laboratory

September 8, 2021

Harry Bratt

Senior Computer Scientist, Speech Technology and Research Laboratory

Publications

November 18, 2022

Toward Fail-Safe Speaker Recognition: Trial-Based Calibration with a Reject Option

In this work, we extend the trial-based calibration (TBC) method, proposing a new similarity metric for selecting training data that yields significant gains over the metric proposed in the original work.

October 1, 2021

Resilient Data Augmentation Approaches to Multimodal Verification in the News Domain

Building on multimodal embedding techniques, we show that two distinct data augmentation approaches, entity linking and cross-domain local similarity scaling, improve results.

July 27, 2021

Natural Language Access: When Reasoning Makes Sense

We argue that to use natural language effectively, we must have both a deep understanding of the subject domain and a general-purpose reasoning capability.

Core technologies and applications

Speech recognition

Speech & audio analytics

Machine translation

Natural language understanding

Information extraction

Platforms

Open Language Interface for Voice Exploitation (OLIVE)

SenSay

DynaSpeak® speech recognition engine

EduSpeak® speech recognition toolkit

SRI Language Modeling (SRILM)