Pull to refresh

How to set up Apache Airflow for 10 minutes via Docker

Level of difficultyMedium
Reading time2 min
Views692

Prerequisites:
1. Install Docker *1
2. Install VSCode

STEP BY STEP

1. Open VSCode that you previously installed and click on "Extensions" tab right on the menu bar, then type 'docker' to find proper extension and click "install":

  1. Same way as first step was done you should find python extension and install it:

  1. Create directory to store content (f.e. /airflow_ais on picture below) and put there two files:
    docker-compose.yml
    dockerfile

  1. Copy following content correspondingly (building a custom Docker image, based on the official Apache Airflow Docker image):

    To docker-compose.yml:

version: '3'

services:
  ais_airflow:

    image: ais_airflow:latest
    volumes:
      - ./airflow:/opt/airflow

    ports:
      - "8080:8080"

    command: airflow standalone

To dockerfile (adding git for conviniance):

FROM apache/airflow:latest

USER root

RUN apt-get update && \
    apt-get -y install git && \
    apt-get clean

USER airflow 

Save files if needed (depends on strategy of autosaving you choose in settings). Its recommended to choose mode 'afterDelay' in order to avoid constant saving changes manually:

  1. Once you create these two configuration files right click on docker file and build image (bottom of context menu) so it will take some time:

Dockerfile provides instructions for building a Docker image, while docker-compose specifies, volumes, networking and other configurations for one or more Docker containers.

  1. Right-click on the 'docker-compose' file and choose 'Compose Up.'

Then you will see these messages in Terminal:

  1. Return to your Docker application and navigate to 'Containers' and find your running one:

Open it and choose 'View Details and click on given host (8080) as link:

Then you will see Airflow UI and then you can be logged in using credentials:
login: admin
password: <password located in file standalone_admin_password.txt> so you may keep/change it or put in .gitignore file:

Airflow has set up!

How to create your first dag - read upcoming articles! See you soon...

P.S.: Dont forget to compose down after work:

*1 Docker is an open software containerization platform designed for developing, shipping, and running applications. It packages the entire application along with its dependencies and configurations within a standardized unit known as a container. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. All containers are run by a single operating system kernel and therefore use fewer resources than a virtual machine.

Docker is a set of Platforms as a service (PaaS) products that use Operating system-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries, and configuration files; they can communicate with each other through well-defined channels. 

Tags:
Hubs:
Total votes 3: ↑1 and ↓2+1
Comments0

Articles