Python script scheduling in Airflow

You should probably use the PythonOperator to call your function. If you want to define the function somewhere else, you can simply import it from a module, as long as it's accessible in your PYTHONPATH.

```python
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from my_script import my_python_function

dag = DAG('tutorial', default_args=default_args)

PythonOperator(dag=dag,
               task_id='my_task_powered_by_python',
               provide_context=False,
               python_callable=my_python_function,
               ...
```

… Read more
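For context, a complete, runnable sketch of the same approach, assuming the Airflow 1.x import paths used above; the default_args values and the @daily schedule are illustrative, and my_script / my_python_function stand in for your own module and callable:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

from my_script import my_python_function  # your module, importable via PYTHONPATH

# Illustrative defaults; tune for your deployment.
default_args = {
    'owner': 'airflow',
    'start_date': datetime(2020, 1, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    'tutorial',
    default_args=default_args,
    schedule_interval='@daily',  # run the callable once a day
)

run_my_function = PythonOperator(
    dag=dag,
    task_id='my_task_powered_by_python',
    provide_context=False,  # the callable takes no Airflow context kwargs
    python_callable=my_python_function,
)
```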

Running Google Colab every day at a specific time

You need to create a notebooks.csv file listing all the Colaboratory URLs. Then use colabctl to run each notebook listed in the CSV (in order, synchronously); it pauses for a period of n seconds before running them all again. You can then run python colabctl.py <end-string> <sleep-seconds>. There's a gCookies.pkl file in the repo. … Read more
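As a sketch of what that setup might look like (the notebook URLs below are placeholders, and the end-string is assumed to be a token your notebooks print when they finish; check the colabctl README for the exact semantics):

```
# notebooks.csv — one Colaboratory URL per line (placeholder URLs)
https://colab.research.google.com/drive/1AbCdEfG
https://colab.research.google.com/drive/2HiJkLmN
```

```
# Run every notebook in notebooks.csv, waiting for each to print "Finished",
# then sleep 86400 seconds (one day) and run them all again.
python colabctl.py Finished 86400
```

With a sleep of 86400 seconds this cycles roughly once a day; if the run must be anchored to a specific clock time, starting the loop at that time (or driving it from cron) is the simplest option.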

Accessing configuration parameters passed to Airflow through CLI

This is probably a continuation of the answer provided by devj.

1. In airflow.cfg, the following property should be set to true: dag_run_conf_overrides_params=True

2. While defining the PythonOperator, pass the following argument: provide_context=True. For example:

```python
get_row_count_operator = PythonOperator(task_id='get_row_count',
                                        python_callable=do_work,
                                        dag=dag,
                                        provide_context=True)
```

3. Define the Python callable (note the use of **kwargs):

```python
def do_work(**kwargs):
    table_name = kwargs['dag_run'].conf.get('table_name')
    # Rest ...
```

… Read more
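Putting the pieces together, a minimal end-to-end sketch, assuming the Airflow 1.x CLI and a hypothetical DAG id of my_dag; the table_name key matches the snippet above, and the printed message is illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def do_work(**kwargs):
    # dag_run.conf holds the JSON document passed via the CLI's -c/--conf flag
    table_name = kwargs['dag_run'].conf.get('table_name')
    print('Counting rows in table: %s' % table_name)


dag = DAG('my_dag',
          start_date=datetime(2020, 1, 1),
          schedule_interval=None)  # triggered manually, not on a schedule

get_row_count_operator = PythonOperator(task_id='get_row_count',
                                        python_callable=do_work,
                                        provide_context=True,
                                        dag=dag)
```

The run configuration is then supplied on the command line, for example:

```
airflow trigger_dag -c '{"table_name": "my_table"}' my_dag
```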

Error!: SQLSTATE[HY000] [1045] Access denied for user 'divattrend_liink'@'localhost' (using password: YES)