LeRobot Hackathons - VLA Models for SO-100

Hackathon projects using VLA and VLM models with the SO-100 robot: shorts folding, kitchen tasks with Moondream, and plantain leaf cleaning


Project Overview

These projects come from LeRobot hackathons that explored Vision-Language-Action (VLA) models and Vision-Language Models (VLMs) for robotic manipulation with the SO-100 robot. Three successive projects cover cloth manipulation, kitchen tasks, and adaptation to a culturally specific task.

Hackathon Projects

Shorts Folding - VLA models for deformable object manipulation
Kitchen Tasks with Moondream - VLM-guided navigation, cleaning, and utensil manipulation
Plantain Leaf Cleaning - Cultural adaptation for Colombian culinary traditions

What are Vision-Language-Action Models?

VLA models combine vision, language understanding, and robotic control in a single end-to-end system:

Vision: Camera input for scene perception
Language: Natural language task instructions
Action: Direct robot control commands

This enables robots to understand tasks from human descriptions and execute them using visual feedback.
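The closed loop described above can be sketched in a few lines of Python. This is only an illustration of the interface (image plus instruction in, joint targets out), not the model used in the hackathons: `Observation`, `toy_policy`, and `rollout` are hypothetical names, the policy is a hand-written placeholder for a trained network, and the robot is idealized as reaching every commanded target exactly.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Observation:
    image: List[List[float]]   # camera frame as rows of grayscale pixels in [0, 1]
    instruction: str           # natural language task description
    joints: List[float]        # current joint positions (units are illustrative)


# A policy maps one observation to the next target joint positions.
Policy = Callable[[Observation], List[float]]


def toy_policy(obs: Observation) -> List[float]:
    """Hypothetical placeholder for a trained VLA network.

    It uses all three inputs: image brightness scales the step size,
    the instruction picks the direction, and the joints are the base.
    """
    pixels = [p for row in obs.image for p in row]
    brightness = sum(pixels) / len(pixels)
    direction = 1.0 if "open" in obs.instruction.lower() else -1.0
    return [q + 0.1 * direction * brightness for q in obs.joints]


def rollout(
    policy: Policy,
    read_camera: Callable[[], List[List[float]]],
    joints: List[float],
    instruction: str,
    steps: int,
) -> List[float]:
    """Run the policy in closed loop, re-observing before every action."""
    joints = list(joints)
    for _ in range(steps):
        obs = Observation(read_camera(), instruction, list(joints))
        joints = policy(obs)  # idealized robot: reaches the target exactly
    return joints
```

Calling `rollout(toy_policy, camera_fn, [0.0] * 6, "Open the gripper", 3)` with any function that returns a frame steps the six joint values three times. In a real system `policy` would be a trained network and `joints` would be read back from the servos instead of being assumed.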

Hackathon 1: Shorts Folding

Challenge: Fold shorts using VLA models - a classic deformable object manipulation task.

Approach:

Imitation learning from teleoperated demonstrations
Vision encoder + language instruction encoder → action policy
Trained on SO-100 robot kinematics
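To make the imitation learning step concrete, here is a minimal behavior cloning sketch in plain Python. It is a toy under stated assumptions, not the training code used for the shorts folding policy: the policy is a linear map rather than a vision and language network, observations are short feature vectors rather than camera frames, and `train_bc` and `predict` are hypothetical helper names. The core idea is the same, though: minimize the squared error between the policy's action and the demonstrator's action.

```python
import random
from typing import List, Sequence, Tuple

# One demonstration step: (observation features, expert action).
Demo = Tuple[List[float], List[float]]


def predict(weights: List[List[float]], obs: Sequence[float]) -> List[float]:
    """Linear policy: one weight row per action dimension."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def train_bc(
    demos: Sequence[Demo],
    obs_dim: int,
    act_dim: int,
    epochs: int = 200,
    lr: float = 0.1,
    seed: int = 0,
) -> List[List[float]]:
    """Fit the policy to expert actions with SGD on squared error."""
    rng = random.Random(seed)
    weights = [[0.0] * obs_dim for _ in range(act_dim)]
    data = list(demos)
    for _ in range(epochs):
        rng.shuffle(data)  # visit demonstrations in a new order each epoch
        for obs, action in data:
            pred = predict(weights, obs)
            for i in range(act_dim):
                err = pred[i] - action[i]
                for j in range(obs_dim):
                    # Gradient of 0.5 * err**2 with respect to weights[i][j].
                    weights[i][j] -= lr * err * obs[j]
    return weights
```

Given a handful of noise-free demonstrations generated by a linear expert, `train_bc` recovers a policy whose predictions match the expert on unseen observations. A real pipeline would replace the linear map with a neural network and the feature vectors with encoded images and instructions.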

Key Skills: Cloth state perception, grasp point detection, sequential folding actions

Hackathon 2: Kitchen Manipulation with Moondream

Moondream is a lightweight vision-language model (VLM) that provides scene understanding and spatial reasoning.

Tasks Completed:

Navigation

Navigate the kitchen using natural language commands such as “Go to the sink” and “Move to the countertop”

Countertop Cleaning

Visual inspection → dirt detection → wiping trajectory → execution
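The "wiping trajectory" step of this pipeline can be illustrated with a simple back-and-forth coverage path. The sketch below assumes the dirt detector returns an axis-aligned bounding box in table coordinates and that the wiping tool has a known width; both are assumptions made for illustration, and `wiping_path` is a hypothetical helper, not part of LeRobot or Moondream.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def wiping_path(
    x_min: float,
    y_min: float,
    x_max: float,
    y_max: float,
    tool_width: float,
) -> List[Point]:
    """Back-and-forth waypoints whose passes span the rectangle.

    Each pass runs along x; successive passes are spaced at most
    tool_width apart in y, and the last pass lies exactly on y_max.
    """
    if tool_width <= 0:
        raise ValueError("tool_width must be positive")
    if x_max < x_min or y_max < y_min:
        raise ValueError("bounding box is inverted")

    path: List[Point] = []
    y = y_min
    left_to_right = True
    while True:
        start, end = (x_min, x_max) if left_to_right else (x_max, x_min)
        path.append((start, y))
        path.append((end, y))
        if y >= y_max:
            break
        y = min(y + tool_width, y_max)  # clamp the final pass to the box edge
        left_to_right = not left_to_right  # reverse direction for the next pass
    return path
```

For a box from (0, 0) to (2, 1) and a tool width of 0.5, this yields three passes at y = 0, 0.5, and 1, alternating direction. The waypoints would still need to be converted to joint targets before execution.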

Fork & Utensil Manipulation

Grasp forks from various orientations
Precise placement and organization
Sorting by type

Table Cleaning

Object detection → removal planning → surface wiping → verification
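One way to read this pipeline is as a short sequence with a verification loop at the end. The sketch below assumes each stage is available as a callable; the names `detect_objects`, `remove_object`, `wipe_surface`, and `is_clean` are hypothetical stand-ins for whatever perception and motion routines are actually used, and the retry limit is an illustrative choice.

```python
from typing import Callable, List


def clean_table(
    detect_objects: Callable[[], List[str]],
    remove_object: Callable[[str], None],
    wipe_surface: Callable[[], None],
    is_clean: Callable[[], bool],
    max_passes: int = 3,
) -> List[str]:
    """Clear the table, then wipe until verification succeeds."""
    log: List[str] = []

    # Object detection and removal: clear the surface before wiping.
    for obj in detect_objects():
        remove_object(obj)
        log.append(f"removed {obj}")

    # Surface wiping with verification after every pass.
    for attempt in range(1, max_passes + 1):
        wipe_surface()
        log.append(f"wipe pass {attempt}")
        if is_clean():
            log.append("verified clean")
            return log

    log.append("verification failed")
    return log
```

Passing the stages in as callables keeps the control flow testable without a robot: simple stubs can stand in for the camera and the arm, and the returned log shows which steps ran.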

Hackathon 3: Plantain Leaf Cleaning

Cultural Context

In Colombian cuisine, plantain leaves wrap tamales and traditional dishes. They must be cleaned carefully before use.

Challenge

Adapt learned manipulation skills to clean large, flexible plantain leaves without tearing them.

Approach:

Gentle edge grasping
Adaptive force control
VLM-guided inspection: “Is this area clean?”, “Where is the dirt?”
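The VLM-guided inspection can be sketched as a loop that asks the same question about each region of the leaf. The example assumes a generic `ask(image, question)` callable that returns the model's text answer; this is a hypothetical wrapper and not Moondream's actual API. Answers that do not start with "yes" are treated as dirty, a deliberately conservative choice for illustration.

```python
from typing import Any, Callable, Dict, List


def inspect_regions(
    regions: Dict[str, Any],
    ask: Callable[[Any, str], str],
    question: str = "Is this area clean?",
) -> List[str]:
    """Return the names of regions the VLM does not confirm as clean."""
    dirty: List[str] = []
    for name, crop in regions.items():
        answer = ask(crop, question).strip().lower()
        # Anything other than a clear "yes" is treated as needing more cleaning.
        if not answer.startswith("yes"):
            dirty.append(name)
    return dirty
```

With a stub `ask` that answers "Yes, it is clean." for some crops and "No, there is dirt near the edge." for others, the function returns only the names of the second group, which can then be cleaned again and re-inspected.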

Impact: Demonstrates how robotic systems can adapt to culturally specific tasks.

Technical Stack

Frameworks:

LeRobot - Robot learning framework
PyTorch - Deep learning
Moondream - Vision-language model

Robot: SO-100 (5 DOF arm + gripper, six servos in total, ~500 mm reach)

Training: Behavior cloning from human demonstrations

Key Learnings

✅ VLAs simplify pipelines - Direct vision-to-action mapping
✅ Language enables intuition - Natural task specification
✅ Pre-training is powerful - Vision-language models provide strong priors
✅ Cultural relevance matters - Robotics should serve diverse communities

Resources

LeRobot: huggingface.co/lerobot
Moondream: huggingface.co/vikhyatk/moondream2
SO-100 Robot: Open-source manipulation platform

Keywords

LeRobot, Vision-Language-Action, VLA, Moondream, SO-100, Robotic Manipulation, Imitation Learning, Kitchen Robotics, Cultural Robotics, Deep Learning, PyTorch

Vision, language, and action for intelligent manipulation 🤖✨


LeRobot hackathon overview - VLA models in action

SO-100 robot manipulation demonstration

Kitchen manipulation tasks with Moondream VLM

Plantain leaf cleaning - cultural task adaptation

Full demonstration video
