Airflow with Celery Executor (macOS)
In my first trials of Airflow I wanted to use Celery to run tasks on worker machines. The Airflow documentation describes the steps, but I wanted more detailed instructions. In this tutorial we configure Airflow to use the Celery executor, with RabbitMQ as the message queue for the tasks.
(Brief instructions are available at https://airflow.apache.org/configuration.html?highlight=postgres#scaling-out-with-celery)
Install RabbitMQ
1. brew install rabbitmq
(If this fails to link the installed files into /usr/local/sbin, then create that directory and run brew link rabbitmq)
2. nano ~/.bash_profile (sudo is not needed to edit your own profile)
3. Append :/usr/local/sbin to PATH
4. source ~/.bash_profile
5. sudo rabbitmq-server -detached
6. sudo rabbitmqctl add_user airflow airflow
7. Check status: sudo rabbitmqctl status
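The broker_url used later in this tutorial points at a virtual host named myvhost, which the steps above do not create. A minimal sketch of the extra setup, assuming the vhost name myvhost from that URL and full permissions for the airflow user. The function setup_airflow_vhost and the RABBITMQCTL variable are hypothetical helpers introduced here for illustration; only the rabbitmqctl subcommands themselves come from RabbitMQ.

```shell
# Hypothetical helper: create a vhost and give a user full access to it.
# RABBITMQCTL can be overridden, e.g. RABBITMQCTL="sudo rabbitmqctl".
RABBITMQCTL=${RABBITMQCTL:-rabbitmqctl}

setup_airflow_vhost() {
  vhost=$1
  user=$2
  # Create the virtual host referenced by the broker URL.
  $RABBITMQCTL add_vhost "$vhost" || return 1
  # Grant configure, write and read permissions on every resource.
  $RABBITMQCTL set_permissions -p "$vhost" "$user" ".*" ".*" ".*" || return 1
  # Print the permissions so the grant can be checked by eye.
  $RABBITMQCTL list_permissions -p "$vhost"
}

# Usage (assumes the airflow user from step 6 already exists):
#   RABBITMQCTL="sudo rabbitmqctl"
#   setup_airflow_vhost myvhost airflow
```

If you would rather not create a vhost, an alternative is to point broker_url at the default vhost instead, but that is a different setup from the one shown in this tutorial.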
Install Celery
1. pip install Celery==3.1.25 (version 4 seems to have issues)
2. Check the Celery installation: celery --version
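Because the install step pins Celery 3.x, it can help to check the major version in a script rather than by eye. A minimal sketch: version_major is a hypothetical helper, and it assumes the version string begins with the dotted version number.

```shell
# Hypothetical helper: print the major version from a string like "3.1.25 ...".
version_major() {
  # Strip everything from the first dot onwards.
  printf '%s\n' "${1%%.*}"
}

# Usage:
#   [ "$(version_major "$(celery --version)")" = "3" ] || echo "unexpected Celery version"
```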
Install Postgres
1. brew install postgresql
2. In psql: CREATE DATABASE airflow;
3. Create a login user "airflow" with password "airflow"
4. pip install psycopg2
5. Run airflow initdb
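Steps 2 and 3 above can be scripted. The sketch below is one possible way, not the only one: it creates the user first so that the database can be created with that user as owner, which is an assumption on my part rather than something the steps require. The function airflow_db_sql is a hypothetical helper that only prints SQL; piping it to psql is what applies it.

```shell
# Hypothetical helper: print SQL that creates a login role and a database it owns.
# The values are interpolated without escaping, so use it only with trusted,
# simple names like the ones in this tutorial.
airflow_db_sql() {
  user=$1
  password=$2
  db=$3
  printf "CREATE USER %s WITH PASSWORD '%s';\n" "$user" "$password"
  printf "CREATE DATABASE %s OWNER %s;\n" "$db" "$user"
}

# Usage (assumes a local Postgres server is running and your macOS user can
# connect to the default "postgres" database):
#   airflow_db_sql airflow airflow airflow | psql postgres
```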
Configure Airflow
1. nano ~/airflow/airflow.cfg
Change the following lines:
executor = CeleryExecutor
broker_url = amqp://airflow:airflow@localhost:5672/myvhost
celery_result_backend = db+postgresql+psycopg2://airflow:airflow@127.0.0.1:5432/airflow
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@127.0.0.1:5432/airflow
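After editing, it is easy to miss one of the four keys, so a quick read-back can help. A minimal sketch, assuming the config file lives at airflow.cfg under AIRFLOW_HOME (default ~/airflow) and that each key appears once. In Airflow 1.x, executor and sql_alchemy_conn are expected under [core], and broker_url and celery_result_backend under [celery]; the hypothetical cfg_value helper below ignores sections and simply returns the first match.

```shell
# Hypothetical helper: print the value of "key = value" from an INI-style file.
# It ignores section headers, so it assumes each key appears only once.
cfg_value() {
  # $1 = path to the config file, $2 = key name
  sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# Read back the four settings changed above, if the file exists.
cfg="${AIRFLOW_HOME:-$HOME/airflow}/airflow.cfg"
if [ -f "$cfg" ]; then
  for key in executor sql_alchemy_conn broker_url celery_result_backend; do
    printf '%s = %s\n' "$key" "$(cfg_value "$cfg" "$key")"
  done
fi
```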
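2. Run the worker with
airflow worker
(If you changed sql_alchemy_conn after running airflow initdb, run airflow initdb again first so that the metadata tables are created in Postgres.)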
Tip: start the webserver and the scheduler as well, each in its own terminal:
airflow webserver
airflow scheduler
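If juggling three terminals gets tedious, the three commands can be launched from one shell. This is only a convenience sketch, not an Airflow feature: start_airflow_services and the AIRFLOW variable are hypothetical names, and the processes are plain background jobs of the current shell.

```shell
# Hypothetical helper: start the three Airflow processes in the background,
# each writing to its own log file. AIRFLOW can be overridden for testing.
AIRFLOW=${AIRFLOW:-airflow}

start_airflow_services() {
  log_dir=$1
  mkdir -p "$log_dir" || return 1
  for svc in webserver scheduler worker; do
    # Each process keeps running until it is stopped; output goes to <svc>.log.
    $AIRFLOW "$svc" > "$log_dir/$svc.log" 2>&1 &
  done
}

# Usage:
#   start_airflow_services "$HOME/airflow/logs/manual"
#   tail -f "$HOME/airflow/logs/manual/worker.log"
```

Use the jobs and kill commands of the same shell to inspect or stop the processes.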
References:
1. https://stlong0521.github.io/20161023%20-%20Airflow.html
2. Setup on Ubuntu machines: https://medium.com/a-r-g-o/installing-apache-airflow-on-ubuntu-aws-6ebac15db211