Malik Talha

Journey into GSoC 2023

GSoC 2023 Final Report: Voice Assistant for Automotive Grade Linux (AGL)

30 Oct 2023


This is the final report of my Google Summer of Code 2023 project with The Linux Foundation on Automotive Grade Linux (AGL). In this report, I discuss the work I did over the past 6 months, the challenges I faced, and the things I learned.

AGL Introduction

Automotive Grade Linux (AGL) is an open-source software platform specifically designed for automotive applications. It provides a standardized and customizable solution for connected cars and infotainment systems. AGL offers a comprehensive set of features, including a Linux-based operating system, middleware, and application framework. It enables automotive manufacturers and developers to collaborate, accelerate innovation, and create secure and reliable software for vehicles. AGL fosters an ecosystem of industry stakeholders, promoting interoperability and driving the adoption of open-source software in the automotive industry.

Project Overview

My project enhances the existing speech recognition framework in Automotive Grade Linux (AGL). The primary objective is to develop and integrate a natural language intent engine. This engine is designed to understand the intent behind a user's voice command and to execute that intent through a software agent. Because it moves in-vehicle voice interaction from plain transcription to acting on what the user means, the project has been named the "Voice Assistant."

The Voice Assistant combines speech recognition and natural language processing to create a voice-driven experience within automotive environments. It is meant to interpret a wide range of user commands, from basic vehicle control functions to more complex queries and requests, such as navigation, media control, and vehicle status inquiries. The aim of this project is to provide an intuitive voice interaction system that improves safety, convenience, and user satisfaction in AGL-equipped vehicles.

Interested in more technical details and how to use the voice assistant? Check out the Official Automotive Grade Linux Documentation.

Project Architecture

The following diagram shows the architecture of the project:

[Figure: Project Architecture]

Planned Goals and Milestones

For my GSoC project, I planned to deliver the following goals and milestones:

  • 🔖 Objective # 01: Development of a Natural Language Understanding (NLU) intent engine and its integration with the existing Vosk implementation.
  • 🔖 Objective # 02: Ability to execute the intent extracted from the NLU engine through an interface that communicates with the APIs.
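
To make Objective # 02 more concrete, here is a minimal Python sketch of one way an extracted intent could be routed to an action. It is not the project's actual code: the `Intent` class, the handler registry, and the intent names `HVACFanSpeed` and `MediaControl` are all hypothetical, and the handlers only return a description instead of calling any real vehicle API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Intent:
    """Hypothetical NLU result: an intent name plus extracted slot values."""
    name: str
    slots: Dict[str, str] = field(default_factory=dict)


# Registry mapping intent names to handler functions (illustrative only).
HANDLERS: Dict[str, Callable[[Intent], str]] = {}


def handles(intent_name: str):
    """Decorator that registers a handler for one intent name."""
    def register(func: Callable[[Intent], str]) -> Callable[[Intent], str]:
        HANDLERS[intent_name] = func
        return func
    return register


@handles("HVACFanSpeed")
def set_fan_speed(intent: Intent) -> str:
    # A real agent would call a vehicle API here; this sketch only reports the action.
    return f"fan speed set to {intent.slots.get('value', 'unknown')}"


@handles("MediaControl")
def media_control(intent: Intent) -> str:
    return f"media action: {intent.slots.get('action', 'unknown')}"


def execute(intent: Intent) -> str:
    """Look up the handler for an intent and run it, or report that none exists."""
    handler = HANDLERS.get(intent.name)
    if handler is None:
        return f"no handler for intent '{intent.name}'"
    return handler(intent)


if __name__ == "__main__":
    print(execute(Intent("HVACFanSpeed", {"value": "3"})))
    print(execute(Intent("OpenSunroof")))
```

Keeping the mapping in a registry means a new intent only needs one new handler function, and an unrecognised intent fails with a clear message instead of an exception.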

Deliverables

I was able to successfully deliver all the planned goals and milestones. The following are the deliverables of my GSoC project:

  • ✅ Deliverable # 01: Integration and development of Snips and RASA Natural Language Understanding Intent Engines.
  • ✅ Deliverable # 02: Ability to execute the intent extracted from the NLU engine using a Python- and GStreamer-based voice agent service.
  • ✅ Deliverable # 03: A Flutter-based IVI application that communicates with the voice agent service to record the user's voice command, transcribe it to text, extract the intent from it, and execute the intent.
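
The flow described in Deliverable # 03 (record, transcribe, extract intent, execute) can be pictured as a chain of swappable stages. The Python sketch below is only an illustration under stated assumptions: every function name is hypothetical, recording is skipped (the "audio" is passed in as bytes), the transcriber just decodes those bytes as text, and the intent extractor is a naive keyword matcher standing in for a real engine such as Snips or RASA.

```python
from typing import Callable, Dict, Tuple

# Each stage is a plain callable, so a real backend could replace a stub later.
Transcriber = Callable[[bytes], str]
IntentExtractor = Callable[[str], Tuple[str, Dict[str, str]]]
Executor = Callable[[str, Dict[str, str]], str]


def fake_transcriber(audio: bytes) -> str:
    # Stand-in for a speech-to-text engine: treats the bytes as UTF-8 text.
    return audio.decode("utf-8").lower().strip()


def keyword_extractor(text: str) -> Tuple[str, Dict[str, str]]:
    # Stand-in for an NLU engine: naive keyword matching, not real NLU.
    words = text.split()
    if "volume" in words:
        if "up" in words or "increase" in words:
            return "volume_change", {"direction": "up"}
        if "down" in words or "decrease" in words:
            return "volume_change", {"direction": "down"}
    return "unknown", {}


def describe_executor(intent: str, slots: Dict[str, str]) -> str:
    # Stand-in for the agent that would act on the intent.
    if intent == "unknown":
        return "nothing executed"
    return f"executed {intent} with {slots}"


def handle_command(
    audio: bytes,
    transcribe: Transcriber,
    extract: IntentExtractor,
    execute: Executor,
) -> Dict[str, object]:
    """Run one voice command through transcription, intent extraction, and execution."""
    text = transcribe(audio)
    intent, slots = extract(text)
    result = execute(intent, slots)
    return {"text": text, "intent": intent, "slots": slots, "result": result}


if __name__ == "__main__":
    print(handle_command(b"Turn the volume up",
                         fake_transcriber, keyword_extractor, describe_executor))
```

Passing the stages in as arguments is what would let two different NLU engines sit behind the same `extract` slot without changing the rest of the flow.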

Project Gerrit Repositories

I have created and/or worked on the following AGL Gerrit repositories for my GSoC project:

Gerrit Individual Commits

I have compiled the list of all the individual commits that I have made during my GSoC project on Gerrit. The following are the links to the commits:

If you are familiar with Gerrit, you can also check out all of my commits in one place on Gerrit.

Official GSoC Project Link

You can check out my official GSoC project credentials here.

Learning Outcomes

Throughout this GSoC experience, I've gained invaluable insights and experiences that extend far beyond the technical aspects of the project and what I had imagined initially. Some of the new tools, technologies, and key lessons I've learned include:

  • Yocto Project: I learned how to use the Yocto Project to build a Linux distribution for embedded devices. I also learned how to create custom recipes and layers using the Yocto Project.
  • AGL Ecosystem: I learned how to work with, collaborate on, and develop for the AGL ecosystem.
  • Flutter & Dart: I learned how to use Flutter and Dart to create Linux-based desktop applications. I also learned how to integrate and test Flutter applications in the AGL distribution.
  • GStreamer: I learned how to use GStreamer to create and manage audio pipelines and how to integrate GStreamer with Python.
  • gRPC: I learned how to use gRPC to create a client-server based voice agent service using Dart and Python.
  • Problem Solving: Tackling unexpected challenges and debugging issues taught me to persevere in the face of adversity and adapt to evolving project requirements.
  • Open Source Culture: Immersing myself in open-source culture has shown me the importance of giving back to the community, sharing knowledge, and promoting transparency.
  • Continuous Learning: GSoC has instilled in me the habit of continuous learning, as I encountered new technologies and coding practices throughout the program.

What's Next?

As I reflect on the journey of developing the Voice Assistant for AGL during GSoC, I am eager to continue my contributions and further refine the project.

  • I am committed to the continuous improvement of the Voice Assistant. This includes addressing feedback, fixing bugs, and staying current with advancements in natural language processing and speech recognition technologies.
  • I plan to remain engaged with the AGL community and help any new contributors who are interested in the project.

Acknowledgments

In closing, I would like to extend my heartfelt gratitude to the following individuals who played pivotal roles in my GSoC journey:

  • Jan-Simon Möller: Your unwavering guidance, expertise, and encouragement were instrumental in my project's success.
  • Scott Murray: Your constant support and invaluable insights enriched my learning experience.
  • Walt Miner: Your mentorship and feedback were invaluable to my project's success.
  • Marius Vlad: I appreciate your mentorship and the feedback you provided during this journey.

I am sincerely appreciative of the knowledge and opportunities you have provided, and I look forward to continuing our collaboration in the open-source community. Additionally, I'd like to thank the wider open-source community for their continuous inspiration and collaboration.


© 2023 Malik Talha, All rights reserved.