Democratizing ML for speech

Practical AI - A podcast by Practical AI LLC



You might know about MLPerf, a benchmark from MLCommons that measures how fast systems can train models to a target quality metric. However, MLCommons is working on so much more! David Kanter joins us in this episode to discuss two new speech datasets that are democratizing machine learning for speech via data scale and language/speaker diversity.

Changelog++ members save 3 minutes on this episode because they made the ads disappear. Join today!

Sponsors:
Changelog++: You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with no ads, extended episodes, outtakes, bonus content, a deep discount in our merch store (soon), and more to come. Let’s do this!
The Brave Browser: Browse the web up to 8x faster than Chrome and Safari, block ads and trackers by default, and reward your favorite creators with the built-in Basic Attention Token. Download Brave for free and give tipping a try right here on changelog.com.
Fastly: Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com

Featuring:
David Kanter
Chris Benson
Daniel Whitenack

Show Notes:
Press release about MLCommons datasets: MLCommons™ Association Unveils Open Datasets and Tools to Drive Democratization of Machine Learning
NeurIPS papers: People’s Speech Dataset; Multilingual Spoken Words Corpus (MSWC)
Gradient article: New Datasets to Democratize Speech Recognition Technology
Blog posts for more insight: People’s Speech; Multilingual Spoken Words Corpus (MSWC)
Downloads: People’s Speech; Multilingual Spoken Words Corpus
