AIDE: ML CodeGen Agent
2025 — LLM Agent for Automated Machine Learning
About
AIDE is an LLM agent that generates solutions for machine learning tasks from natural language descriptions. In a benchmark of 60+ Kaggle competitions, AIDE surpassed 50% of Kaggle participants on average. The agent iteratively runs, debugs, evaluates, and improves ML code autonomously.
Key Features
🗣️ Natural Language Input
Describe your problem, requirements, and expert insights in plain language
📜 Source Code Output
Get tested Python scripts — full transparency and reproducibility
🔄 Iterative Optimization
Runs, debugs, evaluates, and improves the ML code automatically
🌳 Tree Visualization
Visualize the solution tree to understand what works and what doesn't
Solution Space Tree Search
AIDE's approach mirrors how human data scientists work — generate initial drafts, then iteratively refine based on feedback.
1. Solution Generator
Creates new drafts or modifies existing solutions (bug fixes, improvements)
2. Evaluator
Runs solutions and extracts evaluation metrics from logs
3. Base Solution Selector
Picks the most promising solution as the starting point for the next iteration
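The three components above can be sketched as a small loop. This is a minimal illustration, not AIDE's actual implementation: the `Node` class, `select_base`, `tree_search`, and the toy `toy_generate` / `toy_evaluate` functions are all hypothetical names. It assumes a lower metric is better and uses a number in place of real ML code and an LLM call.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """One candidate solution in the tree (hypothetical structure)."""
    code: str
    parent: Optional[Node] = None
    metric: Optional[float] = None  # None means the run failed
    children: list[Node] = field(default_factory=list)


def select_base(nodes: list[Node], num_drafts: int) -> Optional[Node]:
    """Base Solution Selector: returns None to request a fresh draft."""
    drafts = [n for n in nodes if n.parent is None]
    if len(drafts) < num_drafts:
        return None
    scored = [n for n in nodes if n.metric is not None]
    if not scored:
        return nodes[-1]  # nothing ran successfully yet: retry the latest
    return min(scored, key=lambda n: n.metric)  # assumes lower is better


def tree_search(
    generate: Callable[[Optional[Node]], str],
    evaluate: Callable[[str], Optional[float]],
    steps: int,
    num_drafts: int = 2,
) -> Optional[Node]:
    """Run the generate / evaluate / select loop and return the best node."""
    nodes: list[Node] = []
    for _ in range(steps):
        base = select_base(nodes, num_drafts)
        node = Node(code=generate(base), parent=base)  # Solution Generator
        node.metric = evaluate(node.code)              # Evaluator
        if base is not None:
            base.children.append(node)
        nodes.append(node)
    scored = [n for n in nodes if n.metric is not None]
    return min(scored, key=lambda n: n.metric) if scored else None


# Toy stand-ins: a "solution" is just a number stored as text.
def toy_generate(base: Optional[Node]) -> str:
    return "5.0" if base is None else str(float(base.code) - 1.0)


def toy_evaluate(code: str) -> Optional[float]:
    return abs(float(code) - 2.0)  # distance from a made-up optimum


if __name__ == "__main__":
    best = tree_search(toy_generate, toy_evaluate, steps=6, num_drafts=1)
    print(best.code, best.metric)
```

In a real agent, `generate` would prompt an LLM with the base solution and its logs, and `evaluate` would execute the script and parse the reported metric. Keeping every node, including failed ones, is what makes the search a tree rather than a single chain of edits.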
Usage Example
aide data_dir="example_tasks/house_prices" \
goal="Predict the sales price for each house" \
eval="Use the RMSE metric between the logarithm of the predicted and observed values." \
agent.code.model=deepseek-r1:latest
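The `eval` string in the command describes the metric in words. As a rough sketch of what it asks for, the function below computes the RMSE between the logarithms of predicted and observed values. The name `rmse_of_logs` is hypothetical and is not part of AIDE; the sketch assumes the natural logarithm and strictly positive prices.

```python
import math
from typing import Sequence


def rmse_of_logs(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """RMSE between log(predicted) and log(observed); inputs must be positive."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (math.log(p) - math.log(o)) ** 2 for p, o in zip(predicted, observed)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    print(rmse_of_logs([200000.0, 150000.0], [210000.0, 140000.0]))
```

Taking logs first means the error is relative: being 10% off on an expensive house costs about the same as being 10% off on a cheap one.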
Key Takeaways
- ✅ Beats 50% of Kaggle participants on average across 60+ competitions
- ✅ Tree search explores solution space systematically
- ✅ Supports multiple models — GPT-4, DeepSeek-R1, etc.
- ✅ Full transparency — outputs readable Python code