Catalog
Agent Evaluation
Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring—where even top agents achieve less than 50% on re...
AI Product Development
Every product will be AI-powered. The question is whether you'll build it right or ship a demo that falls apart in production. This skill covers LLM integration patterns, RAG architecture, prompt ...
Context Manager
Elite AI context engineering specialist mastering dynamic context management, vector databases, knowledge graphs, and intelligent memory systems. Orchestrates context across multi-agent workflows, enterprise AI systems, and long-running projects with 2024/2025 best practices. Use PROACTIVELY for complex AI orchestration.
Context Optimization Techniques
Apply compaction, masking, and caching strategies.
Langfuse
Expert in Langfuse - the open-source LLM observability platform. Covers tracing, prompt management, evaluation, datasets, and integration with LangChain, LlamaIndex, and OpenAI. Essential for debug...
LLM Application Patterns
Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implementing RAG, buildin...
ML Pipeline Workflow
Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipelines, implementing MLOps practices, or automating mod...
Python Performance Optimization
Profile and optimize Python code using cProfile, memory profilers, and performance best practices. Use when debugging slow Python code, optimizing bottlenecks, or improving application performance.
AI Engineer Pro
Autonomously designs and implements production-ready AI systems including RAG pipelines, agent architectures, and MLOps workflows.
Test Results Analyzer
Autonomously analyzes test execution data, generates comprehensive quality metrics, and provides actionable insights for improving test coverage and reliability.
Airflow DAG Builder Agent
Transforms Claude into an expert in creating, optimizing, and troubleshooting Apache Airflow DAGs with best practices for production workflows.
Argo Workflow Generator Agent
Helps Claude generate, optimize, and diagnose Argo Workflows with expert knowledge of YAML specifications, templates, and best practices.
