ML Object Classifier using YOLOv11 with TTS

FILE:

v1.py

LANGUAGE & VERSION:

Python 3.13.7

OVERVIEW:

This application detects common household objects through your webcam and announces the general category of the main detected object out loud.

PURPOSE:

To experiment with AI/ML on a real, hands-on project. Built primarily as an entry in the Software Development category of the Technology Student Association (TSA) 2025-2026 competition.

PIPELINE:

Clicking the start/stop button starts the capture loop. Each pass reads a single frame from the webcam, and YOLO processes that frame and makes predictions. A simple calculation then picks the largest (and therefore most important) detected object. The name of that object is passed to a TTS function, which converts it to audio and plays it through the computer's speakers. Meanwhile, the processed frame, with bounding boxes drawn around each detected object, is converted in several steps into a format Tkinter can use and is displayed in the application interface. The loop repeats very quickly (30 fps) until the start/stop button is clicked again.
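The "largest object" step can be sketched as follows. This is a minimal illustration, not the code in v1.py: it assumes "largest" means the greatest bounding-box area, and the `Detection` type and function names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Detection(NamedTuple):
    """Hypothetical container for one detected object (not taken from v1.py)."""
    label: str
    x1: float  # left edge of the bounding box, in pixels
    y1: float  # top edge
    x2: float  # right edge
    y2: float  # bottom edge


def box_area(det: Detection) -> float:
    """Area of the bounding box in square pixels; zero for degenerate boxes."""
    return max(0.0, det.x2 - det.x1) * max(0.0, det.y2 - det.y1)


def largest_detection(detections: Sequence[Detection]) -> Optional[Detection]:
    """Return the detection with the largest box, or None if nothing was detected."""
    if not detections:
        return None
    return max(detections, key=box_area)


if __name__ == "__main__":
    frame_detections = [
        Detection("cup", 10, 10, 60, 80),
        Detection("laptop", 100, 50, 500, 350),
        Detection("cell phone", 520, 300, 580, 400),
    ]
    main_object = largest_detection(frame_detections)
    print(main_object.label if main_object else "nothing detected")
```

Returning `None` for an empty frame lets the caller skip the TTS step when nothing is detected.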

FUNCTIONALITY:

The ML model is pre-trained on only 80 categories, so not every object can be classified accurately. The program may identify an object as something else, or may not detect it at all.

OUTPUT:

This program uses your webcam and your speakers, so make sure both are working. The webcam runs a live feed that you can see in the application, and the speakers announce the main object every 3 seconds.
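One way to keep speech to one announcement every 3 seconds while frames arrive much faster is a small time-based gate. The sketch below is an assumption about how this could be done, not the logic in v1.py; the `SpeechThrottle` class and its `clock` parameter are hypothetical.

```python
import time
from typing import Callable, Optional


class SpeechThrottle:
    """Allow an action at most once per interval (hypothetical helper)."""

    def __init__(
        self,
        interval_s: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.interval_s = interval_s
        self._clock = clock  # injectable so the gate can be tested without waiting
        self._last_spoken: Optional[float] = None

    def ready(self) -> bool:
        """Return True, and restart the timer, if the interval has elapsed."""
        now = self._clock()
        if self._last_spoken is None or now - self._last_spoken >= self.interval_s:
            self._last_spoken = now
            return True
        return False


if __name__ == "__main__":
    throttle = SpeechThrottle(interval_s=3.0)
    # Inside the frame loop, speech would run only when the gate opens.
    if throttle.ready():
        print("speak the main object's name here")
```

`time.monotonic` is used rather than `time.time` so that system clock adjustments do not affect the interval.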

DEPENDENCIES:

opencv-python (for webcam usage), ultralytics (for YOLOv11 ML model), pillow (for Python Imaging Library to process frames), edge-tts (for TTS usage), pygame (for audio engine usage)

TERMINAL COMMAND FOR DEPENDENCIES:

pip install opencv-python ultralytics pillow edge-tts pygame
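After installing, a quick check like the one below can confirm that each package is importable. It is a sketch that is not part of v1.py; the `missing_packages` helper is hypothetical, and the mapping pairs each pip package name with the module name it is imported under.

```python
import importlib.util
from typing import Dict, List

# pip package name -> module name used in import statements
REQUIRED_MODULES: Dict[str, str] = {
    "opencv-python": "cv2",
    "ultralytics": "ultralytics",
    "pillow": "PIL",
    "edge-tts": "edge_tts",
    "pygame": "pygame",
}


def missing_packages(modules: Dict[str, str] = REQUIRED_MODULES) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All dependencies are installed.")
```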
