Automation Workflow stable

Apache Airflow

Programmatic workflow orchestration for data pipelines

38.0K stars Since 2020

License Apache-2.0
Min RAM 2 GB
Min CPUs 2 cores
Scaling single_node
Complexity intermediate
Performance medium
Self-hostable
K8s native
Offline
Pricing fully free
Docs quality good
Vendor lock-in none

Use cases

  • Primary: data-pipeline-orchestration
  • Primary: etl-scheduling
  • Primary: ml-pipeline-management
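
The use cases above all come down to running tasks in dependency order. As a rough illustration of that idea only, the sketch below orders a hypothetical extract/transform/load pipeline with Python's standard-library `graphlib`. It does not use Airflow's API, and the task names are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical ETL pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(dependencies):
    """Return task names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(PIPELINE)
print(order)
```

In an orchestrator, the same dependency graph would also drive scheduling, retries, and monitoring; this sketch only shows the ordering step.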

Anti-patterns / when NOT to use

  • Not for real-time streaming
  • Complex setup and operations
  • DAG parsing can be slow
  • Not for event-driven workflows

Replaces / alternatives to

  • AWS Step Functions
  • Google Cloud Composer

Technical specs

Language Python
API type REST
Protocols HTTP
Deployment docker, pip
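
As a sketch of what the REST/HTTP interface listed above might look like in use, the snippet below builds (but does not send) a request that would trigger a DAG run. It assumes the `/api/v1` path layout of the Airflow 2.x stable REST API and a deployment with basic authentication enabled; other major versions may use a different prefix and authentication scheme. The base URL, credentials, DAG id `example_etl`, and the `run_date` parameter are all hypothetical.

```python
import base64
import json
import urllib.request


def build_trigger_request(base_url, dag_id, username, password, conf=None):
    """Build (but do not send) a POST request that would trigger a DAG run."""
    # Assumed path layout: Airflow 2.x stable REST API.
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    body = json.dumps({"conf": conf or {}}).encode("utf-8")
    # Assumes the deployment accepts HTTP basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical local deployment and DAG id, for illustration only.
request = build_trigger_request(
    "http://localhost:8080", "example_etl", "admin", "admin", {"run_date": "2024-01-01"}
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live server.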

Community

GitHub stars 38.0K
Contributors 0
Commit frequency weekly
Plugin ecosystem none
Backing Apache Foundation
Funding foundation

Release

Latest version
Last release
Since 2020

Best fit

Team size solo, small, medium, enterprise
Industries general

Tags

  • dag
  • scheduler
  • data-pipeline
  • etl
  • python
  • orchestration
  • sensors