I am trying to modify a Google Sheets file and a CSV file in Google Drive via the Apache Airflow Google Provider:
df = pd.DataFrame(data)
csv_data = df.to_csv(index=True)
gcs_hook = GCSHook(gcp_conn_id=GOOGLE_CONNECTION)
gcs_hook.upload(
bucket_name=GOOGLE_CLOUD_BUCKET_NAME,
object_name=csv_name,
data=csv_data,
)
sheets_hook = GSheetsHook(gcp_conn_id=GOOGLE_CONNECTION)
sheets_hook.clear(GOOGLE_SHEET_ID, range_=sheet_name)
sheets_hook.append_values(
GOOGLE_SHEET_ID,
range_=sheet_name,
values=[df.columns.tolist()] + df.values.tolist(),
)
The string GOOGLE_CONNECTION
is just the name of the google
connection that I've define using the Apache Airflow GUI. This connection points to a credentials.json
file (the file exists and is discovered), which I obtained in the following way:
credentials.json
Editor
rights for the service account email via the Drive GUI using the Share
optionNevertheless, the operations are unsuccessful:
[2025-04-27, 10:09:26] ERROR - Task failed with exception: source="task"
HttpError:
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 825 in run
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1088 in _execute_task
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/bases/operator.py", line 408 in wrapper
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/standard/operators/python.py", line 212 in execute
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/standard/operators/python.py", line 235 in execute_callable
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/callback_runner.py", line 81 in run
File "/opt/airflow/dags/ingest_and_save.py", line 139 in export_to_gsheet
File "/opt/airflow/dags/ingest_and_save.py", line 127 in write_table_to_Google
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/google/suite/hooks/sheets.py", line 335 in clear
File "/home/airflow/.local/lib/python3.12/site-packages/googleapiclient/_helpers.py", line 130 in positional_wrapper
File "/home/airflow/.local/lib/python3.12/site-packages/googleapiclient/http.py", line 938 in execute
I searched for an option in the Google Cloud API to set the credentials scope, but couldn't find one. I tried manually appending the last two fields to the credentials.json
:
{
"type": "service_account",
"project_id": "",
"private_key_id": "",
"private_key": "",
"client_email": "",
"client_id": "",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "",
"universe_domain": "googleapis.com",
"extra__google_cloud_platform__scopes": ["https://www.googleapis.com/auth/spreadsheets", "https://www.googleapis.com/auth/drive"],
"scope": ["https://www.googleapis.com/auth/spreadsheets", "https://www.googleapis.com/auth/drive"]
}
However, the error is still the same. What am I doing wrong here?