Airflow structure/organization of Dags and tasks

I use something like this. A project is normally something completely separate or unique, for example DAGs to process files that we receive from a certain client, which will be completely unrelated to everything else (almost certainly a separate database schema). I keep my operators, hooks, and some helper scripts (delete all Airflow data for a … Read more

How to limit Airflow to run only one instance of a DAG run at a time?

You've put 'max_active_runs': 1 into the default_args parameter, which is not the correct spot. max_active_runs is a constructor argument for a DAG and should not be put into the default_args dictionary. Here is an example DAG that shows where you need to move it to: dag_args = { 'owner': 'Owner', # 'max_active_runs': 1, # … Read more
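Following the answer above, a minimal sketch of where max_active_runs belongs (the dag_id, owner, and dates here are illustrative, not from the original question):

```python
from datetime import datetime

from airflow import DAG

default_args = {
    "owner": "Owner",
    # "max_active_runs": 1,  # WRONG: silently ignored inside default_args
}

dag = DAG(
    dag_id="example_single_run",
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    max_active_runs=1,  # correct spot: a DAG constructor argument
)
```

With max_active_runs=1 on the constructor, the scheduler will not start a new DAG run while a previous one is still active.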

Can’t import Airflow plugins

After struggling with the Airflow documentation and trying some of the answers here without success, I found this approach from astronomer.io. As they point out, building an Airflow Plugin can be confusing and perhaps not the best way to add hooks and operators going forward. Custom hooks and operators are a powerful way to extend … Read more

Airflow backfill clarification

When you change the scheduler toggle to “on” for a DAG, the scheduler will trigger a backfill of all dag run instances for which it has no status recorded, starting with the start_date you specify in your “default_args”. For example: If the start date was “2017-01-21” and you turned on the scheduling toggle at “2017-01-22T00:00:00” … Read more
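The backfill behaviour described above can be sketched as follows; the dag_id is illustrative, and catchup is the constructor argument that controls whether past runs are created at all:

```python
from datetime import datetime

from airflow import DAG

dag = DAG(
    dag_id="example_backfill",
    start_date=datetime(2017, 1, 21),
    schedule_interval="@daily",
    # With catchup=True (the historical default), toggling the DAG on
    # triggers runs for every schedule interval since start_date.
    # Set catchup=False to schedule only from the current interval onward.
    catchup=False,
)
```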

First time login to Apache Airflow asks for username and password, what is the username and password?

There is no default username and password created if you installed Airflow from a plain Python wheel. Run the following to create a user. For Airflow >=2.0.0: airflow users create --role Admin --username admin --email admin --firstname admin --lastname admin --password admin OR For Airflow <1.10.14: airflow create_user -r Admin -u admin -e admin@example.com -f admin -l … Read more

For Apache Airflow, How can I pass the parameters when manually trigger DAG via CLI?

You can pass parameters from the CLI using --conf '{"key":"value"}' and then use them in the DAG file as "{{ dag_run.conf["key"] }}" in a templated field. CLI: airflow trigger_dag 'example_dag_conf' -r 'run_id' --conf '{"message":"value"}' DAG file: args = { 'start_date': datetime.utcnow(), 'owner': 'airflow', } dag = DAG( dag_id='example_dag_conf', default_args=args, schedule_interval=None, ) def run_this_func(ds, **kwargs): print("Remotely received … Read more
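A minimal sketch of the receiving side, assuming the --conf payload from the CLI example above; the task_id is illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_this_func(**context):
    # dag_run.conf holds the JSON dictionary passed via --conf on the CLI
    message = context["dag_run"].conf.get("message")
    print(f"Remotely received value: {message}")

with DAG(
    dag_id="example_dag_conf",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,  # manual triggers only
) as dag:
    PythonOperator(task_id="run_this", python_callable=run_this_func)
```

Triggering with airflow trigger_dag 'example_dag_conf' --conf '{"message":"value"}' makes the value available both in Python via context["dag_run"].conf and in templated fields via {{ dag_run.conf["message"] }}.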

Removing Airflow task logs

Please refer to https://github.com/teamclairvoyant/airflow-maintenance-dags This repository has DAGs that can kill halted tasks and clean up logs. You can take those concepts and come up with a new DAG that cleans up according to your own requirements.
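As a starting point for such a cleanup DAG, here is a minimal sketch of the core deletion logic, assuming logs live under one base directory and anything older than a retention window can go; the directory layout and 30-day window are assumptions, not Airflow defaults:

```python
import time
from pathlib import Path

def clean_old_logs(base_dir: str, max_age_days: int = 30) -> int:
    """Delete *.log files older than max_age_days under base_dir.

    Returns the number of files removed.
    """
    cutoff = time.time() - max_age_days * 86400
    removed = 0
    for path in Path(base_dir).rglob("*.log"):
        # Compare the file's last-modified time against the cutoff
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed
```

Wrapped in a PythonOperator on a daily schedule, this gives a simple self-maintaining log rotation; the maintenance DAGs linked above add dry-run modes and database cleanup on top of the same idea.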

Error!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)