Machine Learning Beginner

Object Detection and Tracking with YOLO

Dive into the world of computer vision with this beginner-friendly, project-based course. Learn how to leverage the powerful YOLO (You Only Look Once) architecture to build real-time AI applications from scratch. Through hands-on exercises, you will master everything from running pre-trained models and creating custom datasets to implementing robust object tracking. By the end of this journey, you will have the practical skills to build and deploy your own intelligent vision systems.

7 Weeks
Project-Based Learning

About this Course

This comprehensive course is designed to take you from beginner concepts to advanced mastery. You will build real-world skills through hands-on practice and expert guidance.

Course Syllabus

Module 1: Introduction to Computer Vision and the YOLO Ecosystem

  • The fundamentals of Computer Vision: Object Detection vs. Image Classification.
  • The evolution and architecture of YOLO (Why it's fast and accurate).
  • Setting up your development environment (Python, OpenCV, and Ultralytics library).
  • Running your very first YOLO inference on a sample image.
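
A first inference could look roughly like the sketch below. It assumes the Ultralytics package is installed (pip install ultralytics). The weights file yolov8n.pt, the image name sample.jpg, and the helper name detect_objects are placeholders chosen for illustration; the course itself may use a different model version.

```python
def detect_objects(image_path, weights="yolov8n.pt"):
    """Run a pre-trained YOLO model on one image and return plain tuples."""
    # Imported inside the function so this file loads even before
    # the ultralytics package has been installed.
    from ultralytics import YOLO

    model = YOLO(weights)          # weights name is an assumed placeholder
    result = model(image_path)[0]  # one Results object per input image

    detections = []
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()   # corner coordinates in pixels
        label = result.names[int(box.cls[0])]   # class index -> class name
        confidence = float(box.conf[0])
        detections.append((label, confidence, (x1, y1, x2, y2)))
    return detections
```

Calling detect_objects("sample.jpg") would return one (label, confidence, box) tuple per detected object, which is the same information Module 2 goes on to interpret.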

Module 2: Mastering Pre-Trained Models

  • Understanding the output: Bounding boxes, class labels, and confidence scores.
  • Processing different media types: Images, Videos, and Live Webcams.
  • Customizing the output visualization using OpenCV.
  • Mini-Project: Creating a real-time face and person detector using a webcam feed.
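
To make the three output fields concrete, here is a minimal sketch in plain Python. The Detection class, the helper names, and the sample values are all made up for illustration; a real model returns its own result objects, which would be converted into something like this.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # class label, e.g. "person"
    confidence: float  # score between 0 and 1
    box: tuple         # (x1, y1, x2, y2) corners in pixels


def filter_detections(detections, min_confidence=0.5, wanted_labels=None):
    """Keep detections above a confidence threshold, optionally by label."""
    kept = []
    for det in detections:
        if det.confidence < min_confidence:
            continue
        if wanted_labels is not None and det.label not in wanted_labels:
            continue
        kept.append(det)
    return kept


def caption(det):
    """Build the text that is typically drawn above a bounding box."""
    return f"{det.label} {det.confidence:.0%}"


# Hypothetical detections for a single frame.
raw = [
    Detection("person", 0.91, (34, 50, 210, 400)),
    Detection("dog", 0.42, (220, 180, 330, 390)),
    Detection("person", 0.30, (400, 60, 470, 220)),
]
people = filter_detections(raw, min_confidence=0.5, wanted_labels={"person"})
print([caption(d) for d in people])  # ['person 91%']
```

Raising or lowering min_confidence is the usual first knob to turn when a detector shows too many false positives or misses objects.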

Module 3: Data Collection and Annotation

  • The importance of high-quality data in Machine Learning.
  • Sourcing public datasets (Kaggle, Roboflow Universe).
  • Best practices for image annotation using tools like LabelImg or Roboflow.
  • Understanding the YOLO dataset format (images, labels, and the yaml configuration).
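
In the YOLO label format, each object is one text line: a class index followed by the box center, width, and height, all normalized to the image size. The sketch below converts between pixel boxes and label lines; the function names and the 640x480 example image are illustrative assumptions.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Convert a pixel box (x1, y1, x2, y2) into one YOLO label line."""
    x1, y1, x2, y2 = box
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def from_yolo_label(line, image_width, image_height):
    """Convert a YOLO label line back into (class_id, pixel box)."""
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * image_width, float(w) * image_width
    yc, h = float(yc) * image_height, float(h) * image_height
    return int(class_id), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# A box drawn on a hypothetical 640x480 image, labeled as class 0.
line = to_yolo_label(0, (100, 200, 300, 400), 640, 480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```

Because the values are normalized, the same label file stays valid if the image is later resized for training.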

Module 4: Training a Custom YOLO Model

  • Preparing your dataset for training (Train, Valid, Test splits).
  • Configuring training hyperparameters (Epochs, Batch Size, Image Size).
  • Training the model using cloud GPUs (Google Colab).
  • Reading and interpreting training metrics (Loss graphs and mean Average Precision/mAP).
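
One simple way to produce the three splits is a seeded shuffle followed by slicing, sketched below. The 70/20/10 ratio, the seed, and the file names are assumptions for illustration rather than the course's prescribed values.

```python
import random


def split_dataset(items, train=0.7, valid=0.2, seed=42):
    """Shuffle items reproducibly and cut them into train/valid/test lists."""
    if not 0 < train + valid < 1:
        raise ValueError("train + valid must leave room for a test split")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed -> same split each run
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],       # whatever remains is the test set
    )


# Hypothetical image file names.
names = [f"img_{i:03d}.jpg" for i in range(100)]
train_set, valid_set, test_set = split_dataset(names)
print(len(train_set), len(valid_set), len(test_set))  # 70 20 10
```

Keeping the seed fixed means the test images never leak into training when the script is rerun.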

Module 5: Advanced Object Tracking

  • The limitations of standard Object Detection in video streams.
  • Introduction to multi-object tracking algorithms (ByteTrack and BoT-SORT).
  • Assigning and maintaining unique IDs for detected objects across multiple frames.
  • Handling occlusion (when objects are temporarily hidden).
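
The idea of keeping IDs stable across frames can be illustrated with a deliberately simplified tracker that matches boxes by overlap (IoU). This is a teaching toy, not ByteTrack or BoT-SORT, and the class name, thresholds, and sample boxes are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class SimpleTracker:
    """Greedy IoU matcher that keeps lost tracks alive for a few frames."""

    def __init__(self, iou_threshold=0.3, max_missed=5):
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed  # frames a track may stay unmatched
        self.tracks = {}              # track_id -> {"box": ..., "missed": int}
        self.next_id = 1

    def update(self, boxes):
        """Match this frame's boxes to known tracks; return {track_id: box}."""
        assigned = {}
        unmatched = set(self.tracks)
        for box in boxes:
            best_id, best_iou = None, self.iou_threshold
            for track_id in unmatched:
                score = iou(self.tracks[track_id]["box"], box)
                if score >= best_iou:
                    best_id, best_iou = track_id, score
            if best_id is None:        # no overlap with any track: new object
                best_id = self.next_id
                self.next_id += 1
            else:
                unmatched.discard(best_id)
            self.tracks[best_id] = {"box": box, "missed": 0}
            assigned[best_id] = box
        # Unmatched tracks are remembered briefly to survive short occlusions.
        for track_id in unmatched:
            self.tracks[track_id]["missed"] += 1
            if self.tracks[track_id]["missed"] > self.max_missed:
                del self.tracks[track_id]
        return assigned


tracker = SimpleTracker()
frame1 = tracker.update([(10, 10, 50, 50), (200, 200, 260, 260)])
frame2 = tracker.update([(14, 12, 54, 52)])  # second object is hidden
frame3 = tracker.update([(18, 14, 58, 54), (204, 202, 264, 262)])
print(sorted(frame1), sorted(frame2), sorted(frame3))  # [1, 2] [1] [1, 2]
```

The second object keeps ID 2 after reappearing because its track was remembered while it was hidden. Production trackers add motion prediction and smarter matching on top of this basic idea.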

Module 6: Combining AI with Spatial Logic (OpenCV)

  • Defining Regions of Interest (ROI) and creating masking zones.
  • Drawing virtual boundaries and lines on video frames.
  • Mathematical logic for computer vision: Calculating the center points of bounding boxes.
  • Determining the direction of a moving object based on coordinate changes.
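
The spatial logic in this module comes down to a few lines of arithmetic, sketched here without OpenCV. The function names, the rectangular ROI, and the min_shift threshold are illustrative assumptions.

```python
def box_center(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def in_roi(point, roi):
    """True if the point lies inside a rectangular ROI (x1, y1, x2, y2)."""
    x, y = point
    return roi[0] <= x <= roi[2] and roi[1] <= y <= roi[3]


def movement_direction(previous_center, current_center, min_shift=2.0):
    """Classify the dominant movement between two frames.

    Image coordinates grow to the right and downward, so a positive
    change in y means the object moved down the frame.
    """
    dx = current_center[0] - previous_center[0]
    dy = current_center[1] - previous_center[1]
    if max(abs(dx), abs(dy)) < min_shift:
        return "stationary"            # ignore jitter below the threshold
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


# The same hypothetical object in two consecutive frames.
previous = box_center((100, 100, 200, 180))   # (150.0, 140.0)
current = box_center((104, 130, 204, 210))    # (154.0, 170.0)
print(movement_direction(previous, current))  # down
print(in_roi(current, (0, 150, 640, 480)))    # True
```

Using the center point rather than a box corner makes the result less sensitive to small changes in box size from frame to frame.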

Module 7: Model Export and Optimization

  • Best practices for saving and loading your custom weights (best.pt).
  • Exporting YOLO models to different formats (ONNX, TensorRT, or TensorFlow Lite) for faster inference.
  • Tips for optimizing processing speed on standard CPUs.
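
A rough sketch of both steps follows: exporting trained weights and then measuring throughput. The export helper assumes the Ultralytics package is installed and that best.pt exists; the helper names and the stand-in workload are assumptions for illustration.

```python
import time


def export_model(weights="best.pt", export_format="onnx"):
    """Export trained weights and return the path reported by Ultralytics."""
    # Imported lazily so the timing helper below works without the package.
    from ultralytics import YOLO

    model = YOLO(weights)
    # Other format strings include "engine" (TensorRT) and "tflite".
    return model.export(format=export_format)


def measure_fps(process_frame, frames, warmup=1):
    """Average frames per second of process_frame over a set of frames."""
    frames = list(frames)
    for frame in frames[:warmup]:
        process_frame(frame)           # warm-up runs are not timed
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; in practice process_frame would call the model.
fps = measure_fps(lambda frame: sum(range(10_000)), range(50))
print(f"{fps:.1f} FPS")
```

Measuring before and after an export, or after lowering the input image size, is a simple way to check whether an optimization actually helps on a given CPU.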

Module 8: Capstone Project Assembly

  • Integrating custom detection models with tracking algorithms.
  • Merging spatial logic (from Module 6) with the AI tracking pipeline.
  • Troubleshooting common integration issues.
  • Finalizing the codebase and exporting the final project output.

Capstone Project

Featured Project

Smart Traffic Monitoring and Vehicle Counting System

As the final project, you will build a comprehensive, real-time traffic monitoring application. Starting with raw CCTV highway footage, you will deploy a YOLO model to identify different types of vehicles. You will then implement a tracking algorithm to follow each vehicle's trajectory and apply spatial logic to count them as they cross a customized "virtual tripwire" drawn on the screen. This project mimics real-world smart city applications and serves as a powerful portfolio piece.
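
The "virtual tripwire" counting step can be sketched in plain Python as follows. It assumes a tracker already supplies a center point per vehicle ID for each frame; the class name, the inbound/outbound convention, and the sample coordinates are assumptions for illustration.

```python
def side_of_line(point, line_start, line_end):
    """Return 1 or -1 for the side of the line a point is on, 0 if on it."""
    (x1, y1), (x2, y2) = line_start, line_end
    cross = (x2 - x1) * (point[1] - y1) - (y2 - y1) * (point[0] - x1)
    return (cross > 0) - (cross < 0)


class TripwireCounter:
    """Counts each track ID once when its center changes side of the line.

    Simplification: the line is treated as infinite, not as a segment.
    """

    def __init__(self, line_start, line_end):
        self.line_start = line_start
        self.line_end = line_end
        self.last_side = {}   # track_id -> last non-zero side seen
        self.counted = set()  # IDs that already crossed, to avoid recounts
        self.inbound = 0
        self.outbound = 0

    def update(self, tracked_centers):
        """tracked_centers maps track_id -> (x, y) center for one frame."""
        for track_id, center in tracked_centers.items():
            side = side_of_line(center, self.line_start, self.line_end)
            previous = self.last_side.get(track_id)
            if side != 0:
                self.last_side[track_id] = side
            if previous is None or side == 0 or side == previous:
                continue              # first sighting, or no crossing yet
            if track_id in self.counted:
                continue
            self.counted.add(track_id)
            if side > 0:              # assumed convention: moving down = inbound
                self.inbound += 1
            else:
                self.outbound += 1


# Horizontal tripwire at y = 300 on a hypothetical 640-pixel-wide frame.
counter = TripwireCounter((0, 300), (640, 300))
frames = [
    {1: (100, 280), 2: (400, 330)},
    {1: (102, 295), 2: (398, 310)},
    {1: (104, 310), 2: (396, 290)},
    {1: (106, 325), 2: (394, 270)},
]
for centers in frames:
    counter.update(centers)
print(counter.inbound, counter.outbound)  # 1 1
```

Vehicle 1 crosses the line moving down the frame and vehicle 2 crosses moving up, so each direction is counted once, and the counted set prevents either from being counted again on later frames.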

Core Project Goal

Apply all the skills you've learned to build a production-ready application from scratch. This project serves as your portfolio piece.

Key Features:

  • Multi-Class Detection: Accurately identifies and categorizes cars, trucks, buses, and motorcycles in a busy environment.
  • Persistent Object Tracking: Uses advanced tracking (like ByteTrack) to assign a unique ID to every vehicle, ensuring no vehicle is counted twice, even if it temporarily stops or is partially obscured.
  • Directional Line Crossing Logic: Features a virtual boundary line built with OpenCV that tracks the trajectory of bounding box center-points to determine if a vehicle is traveling inbound or outbound.
  • Live Dashboard Overlay: Displays a real-time, aesthetically pleasing on-screen counter showing the total vehicle count, current FPS (Frames Per Second), and active bounding boxes with trailing paths.

Total Investment
Rp 7,000,000
One-time payment
Duration: 7 Weeks
Mode: Online
Certificate of Completion