example-ml-training-pipeline-train-model-h0heesi0
[2025-09-04, 13:41:54 UTC] {local_task_job_runner.py:123} ▶ Pre task execution logs
[2025-09-04, 13:41:54 UTC] {crypto.py:82} WARNING - empty cryptography key - values will not be stored encrypted.
[2025-09-04, 13:41:54 UTC] {pod.py:1276} INFO - Building pod train-model-u726z1o with labels: {'dag_id': 'example_ml_training_pipeline', 'task_id': 'train_model', 'run_id': 'manual__2025-09-04T134148.5134530000-ac5d4c737', 'kubernetes_pod_operator': 'True', 'try_number': '1'}
[2025-09-04, 13:41:54 UTC] {pod.py:573} INFO - Found matching pod train-model-u726z1o with labels {'airflow_kpo_in_cluster': 'True', 'airflow_version': '2.10.5', 'app': 'airflow', 'component': 'task-pod', 'dag_id': 'example_ml_training_pipeline', 'kubernetes_pod_operator': 'True', 'release': 'dev-kevinbazira', 'routed_via': 'dev-kevinbazira', 'run_id': 'manual__2025-09-04T134148.5134530000-ac5d4c737', 'task_id': 'train_model', 'try_number': '1'}
[2025-09-04, 13:41:54 UTC] {pod.py:574} INFO - `try_number` of task_instance: 1
[2025-09-04, 13:41:54 UTC] {pod.py:575} INFO - `try_number` of pod: 1
[2025-09-04, 13:41:55 UTC] {pod_manager.py:390} INFO - The Pod has an Event: Successfully assigned airflow-dev/train-model-u726z1o to dse-k8s-worker1017.eqiad.wmnet from None
[2025-09-04, 13:41:55 UTC] {pod_manager.py:410} ▶ Waiting until 120s to get the POD scheduled...
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] total 28
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] drwxrwsr-x 2 runuser runuser 4096 Sep 4 09:37 example_etl_output.parquet
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] drwxrws--- 2 root runuser 16384 Aug 28 12:59 lost+found
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] -rw-rw-r-- 1 runuser runuser 85 Sep 4 12:25 model.pkl
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] -rw-rw-r-- 1 runuser runuser 0 Sep 4 09:37 test_write.txt
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] drwxrwsr-x 3 runuser runuser 4096 Aug 28 13:00 training
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] Traceback (most recent call last):
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] File "/srv/example/training/example/train/src/example/train_model.py", line 40, in <module>
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] ROCm (AMD GPU) is available.
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] ROCm device count: 0
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] main()
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] File "/srv/example/training/example/train/src/example/train_model.py", line 20, in main
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] print(f"ROCm device name: {torch.cuda.get_device_name(0)}")
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] File "/opt/lib/python/site-packages/torch/cuda/__init__.py", line 493, in get_device_name
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] return get_device_properties(device).name
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] File "/opt/lib/python/site-packages/torch/cuda/__init__.py", line 523, in get_device_properties
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] _lazy_init() # will define _get_device_properties
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] ^^^^^^^^^^^^
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] File "/opt/lib/python/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
[2025-09-04, 13:42:25 UTC] {pod_manager.py:536} INFO - [base] torch._C._cuda_init()
[2025-09-04, 13:42:26 UTC] {pod_manager.py:555} INFO - [base] RuntimeError: No HIP GPUs are available
[2025-09-04, 13:42:26 UTC] {pod_manager.py:582} WARNING - Pod train-model-u726z1o log read interrupted but container base still running. Logs generated in the last one second might get duplicated.
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] Traceback (most recent call last):
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] File "/srv/example/training/example/train/src/example/train_model.py", line 40, in <module>
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] ROCm (AMD GPU) is available.
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] ROCm device count: 0
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] main()
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] File "/srv/example/training/example/train/src/example/train_model.py", line 20, in main
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] print(f"ROCm device name: {torch.cuda.get_device_name(0)}")
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] File "/opt/lib/python/site-packages/torch/cuda/__init__.py", line 493, in get_device_name
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] return get_device_properties(device).name
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] File "/opt/lib/python/site-packages/torch/cuda/__init__.py", line 523, in get_device_properties
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] _lazy_init() # will define _get_device_properties
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] ^^^^^^^^^^^^
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] File "/opt/lib/python/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
[2025-09-04, 13:42:27 UTC] {pod_manager.py:536} INFO - [base] torch._C._cuda_init()
[2025-09-04, 13:42:27 UTC] {pod_manager.py:555} INFO - [base] RuntimeError: No HIP GPUs are available
[2025-09-04, 13:42:28 UTC] {pod_manager.py:714} INFO - Pod train-model-u726z1o has phase Running
[2025-09-04, 13:42:30 UTC] {pod.py:1122} INFO - Deleting pod: train-model-u726z1o
[2025-09-04, 13:42:30 UTC] {taskinstance.py:3313} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 768, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 734, in _execute_callable
    return ExecutionCallableRunner(
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 424, in wrapper
    return func(self, *args, **kwargs)
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 640, in execute
    return self.execute_sync(context)
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 721, in execute_sync
    self.cleanup(
  File "/tmp/pyenv/versions/3.10.15/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 1053, in cleanup
    raise AirflowException(
airflow.exceptions.AirflowException: Pod train-model-u726z1o returned a failure.
remote_pod: {'api_version': 'v1', 'kind': 'Pod', 'metadata': {'annotations': {'cni.projectcalico.org/containerID': '511bb84917686605961322ad194937cd6d0807ecd75c88b651aa03bbabc5be92', 'cni.projectcalico.org/podIP': '', 'cni.projectcalico.org/podIPs': '', 'container.seccomp.security.alpha.kubernetes.io/base': 'runtime/default'}, 'creation_timestamp': datetime.datetime(2025, 9, 4, 13, 41, 54, tzinfo=tzlocal()), 'deletion_grace_period_seconds': None, 'deletion_timestamp': None, 'finalizers': None, 'generate_name': None, 'generation': None, 'labels': {'airflow_kpo_in_cluster': 'True', 'airflow_version': '2.10.5', 'app': 'airflow', 'component': 'task-pod', 'dag_id': 'example_ml_training_pipeline', 'kubernetes_pod_operator': 'True', 'release': 'dev-kevinbazira', 'routed_via': 'dev-kevinbazira', 'run_id': 'manual__2025-09-04T134148.5134530000-ac5d4c737', 'task_id': 'train_model', 'try_number': '1'}, 'managed_fields': [{'api_version': 'v1', 'fields_type': 'FieldsV1', 'fields_v1': {'f:metadata': {'f:labels': {'.': {}, 'f:airflow_kpo_in_cluster': {}, 'f:airflow_version': {}, 'f:app': {}, 'f:component': {}, 'f:dag_id': {}, 'f:kubernetes_pod_operator': {}, 'f:release': {}, 'f:routed_via': {}, 'f:run_id': {}, 'f:task_id': {}, 'f:try_number': {}}}, 'f:spec': {'f:affinity': {}, 'f:containers': {'k:{"name":"base"}': {'.': {}, 'f:args': {}, 'f:command': {}, 'f:env': {'.': {}, 'k:{"name":"AWS_REQUEST_CHECKSUM_CALCULATION"}': {'.': {}, 'f:name': {}, 'f:value': {}}, 'k:{"name":"AWS_RESPONSE_CHECKSUM_VALIDATION"}': {'.': {}, 'f:name': {}, 'f:value': {}}, 'k:{"name":"REQUESTS_CA_BUNDLE"}': {'.': {}, 'f:name': {}, 'f:value': {}}}, 'f:image': {}, 'f:imagePullPolicy': {}, 'f:name': {}, 'f:resources': {'.': {}, 'f:limits': {'.': {}, 'f:cpu': {}, 'f:memory': {}}, 'f:requests': {'.': {}, 'f:cpu': {}, 'f:memory': {}}}, 'f:securityContext': {'.': {}, 'f:allowPrivilegeEscalation': {}, 'f:capabilities': {'.': {}, 'f:drop': {}}, 'f:runAsNonRoot': {}, 'f:seccompProfile': {'.': {}, 'f:type': {}}}, 
'f:terminationMessagePath': {}, 'f:terminationMessagePolicy': {}, 'f:volumeMounts': {'.': {}, 'k:{"mountPath":"/mnt/model-training"}': {'.': {}, 'f:mountPath': {}, 'f:name': {}}}}}, 'f:dnsPolicy': {}, 'f:enableServiceLinks': {}, 'f:priorityClassName': {}, 'f:restartPolicy': {}, 'f:schedulerName': {}, 'f:securityContext': {}, 'f:terminationGracePeriodSeconds': {}, 'f:volumes': {'.': {}, 'k:{"name":"airflow-ml-model-training-volume"}': {'.': {}, 'f:name': {}, 'f:persistentVolumeClaim': {'.': {}, 'f:claimName': {}}}}}}, 'manager': 'OpenAPI-Generator', 'operation': 'Update', 'subresource': None, 'time': datetime.datetime(2025, 9, 4, 13, 41, 54, tzinfo=tzlocal())}, {'api_version': 'v1', 'fields_type': 'FieldsV1', 'fields_v1': {'f:metadata': {'f:annotations': {'f:cni.projectcalico.org/containerID': {}, 'f:cni.projectcalico.org/podIP': {}, 'f:cni.projectcalico.org/podIPs': {}}}}, 'manager': 'Go-http-client', 'operation': 'Update', 'subresource': 'status', 'time': datetime.datetime(2025, 9, 4, 13, 42, 3, tzinfo=tzlocal())}, {'api_version': 'v1', 'fields_type': 'FieldsV1', 'fields_v1': {'f:status': {'f:conditions': {'k:{"type":"ContainersReady"}': {'.': {}, 'f:lastProbeTime': {}, 'f:lastTransitionTime': {}, 'f:reason': {}, 'f:status': {}, 'f:type': {}}, 'k:{"type":"Initialized"}': {'.': {}, 'f:lastProbeTime': {}, 'f:lastTransitionTime': {}, 'f:status': {}, 'f:type': {}}, 'k:{"type":"Ready"}': {'.': {}, 'f:lastProbeTime': {}, 'f:lastTransitionTime': {}, 'f:reason': {}, 'f:status': {}, 'f:type': {}}}, 'f:containerStatuses': {}, 'f:hostIP': {}, 'f:phase': {}, 'f:podIP': {}, 'f:podIPs': {'.': {}, 'k:{"ip":"10.67.30.60"}': {'.': {}, 'f:ip': {}}, 'k:{"ip":"2620:0:861:302:677:6c3a:553b:5fc"}': {'.': {}, 'f:ip': {}}}, 'f:startTime': {}}}, 'manager': 'kubelet', 'operation': 'Update', 'subresource': 'status', 'time': datetime.datetime(2025, 9, 4, 13, 42, 27, tzinfo=tzlocal())}], 'name': 'train-model-u726z1o', 'namespace': 'airflow-dev', 'owner_references': None, 'resource_version': 
'745183545', 'self_link': None, 'uid': '6a3ae65b-7f38-4a48-b9ac-fb419e847320'}, 'spec': {'active_deadline_seconds': None, 'affinity': {'node_affinity': None, 'pod_affinity': None, 'pod_anti_affinity': None}, 'automount_service_account_token': None, 'containers': [{'args': ['ls -l /mnt/model-training && python3 ' 'training/example/train/src/example/train_model.py ' '--input-path ' '/mnt/model-training/example_etl_output.parquet ' '--output-model-path ' '/mnt/model-training/model.pkl && ls -l ' '/mnt/model-training'], 'command': ['bash', '-c'], 'env': [{'name': 'REQUESTS_CA_BUNDLE', 'value': '/etc/ssl/certs/ca-certificates.crt', 'value_from': None}, {'name': 'AWS_REQUEST_CHECKSUM_CALCULATION', 'value': 'WHEN_REQUIRED', 'value_from': None}, {'name': 'AWS_RESPONSE_CHECKSUM_VALIDATION', 'value': 'WHEN_REQUIRED', 'value_from': None}], 'env_from': None, 'image': 'docker-registry.discovery.wmnet/repos/machine-learning/ml-pipelines:job-605476', 'image_pull_policy': 'IfNotPresent', 'lifecycle': None, 'liveness_probe': None, 'name': 'base', 'ports': None, 'readiness_probe': None, 'resize_policy': None, 'resources': {'claims': None, 'limits': {'cpu': '2', 'memory': '3Gi'}, 'requests': {'cpu': '1', 'memory': '1500Mi'}}, 'restart_policy': None, 'security_context': {'allow_privilege_escalation': False, 'app_armor_profile': None, 'capabilities': {'add': None, 'drop': ['ALL']}, 'privileged': None, 'proc_mount': None, 'read_only_root_filesystem': None, 'run_as_group': None, 'run_as_non_root': True, 'run_as_user': None, 'se_linux_options': None, 'seccomp_profile': {'localhost_profile': None, 'type': 'RuntimeDefault'}, 'windows_options': None}, 'startup_probe': None, 'stdin': None, 'stdin_once': None, 'termination_message_path': '/dev/termination-log', 'termination_message_policy': 'File', 'tty': None, 'volume_devices': None, 'volume_mounts': [{'mount_path': '/mnt/model-training', 'mount_propagation': None, 'name': 'airflow-ml-model-training-volume', 'read_only': None, 
'recursive_read_only': None, 'sub_path': None, 'sub_path_expr': None}, {'mount_path': '/var/run/secrets/kubernetes.io/serviceaccount', 'mount_propagation': None, 'name': 'kube-api-access-t2jw5', 'read_only': True, 'recursive_read_only': None, 'sub_path': None, 'sub_path_expr': None}], 'working_dir': None}], 'dns_config': None, 'dns_policy': 'ClusterFirst', 'enable_service_links': True, 'ephemeral_containers': None, 'host_aliases': None, 'host_ipc': None, 'host_network': None, 'host_pid': None, 'host_users': None, 'hostname': None, 'image_pull_secrets': None, 'init_containers': None, 'node_name': 'dse-k8s-worker1017.eqiad.wmnet', 'node_selector': None, 'os': None, 'overhead': None, 'preemption_policy': 'PreemptLowerPriority', 'priority': -100, 'priority_class_name': 'low-priority-pod', 'readiness_gates': None, 'resource_claims': None, 'resources': None, 'restart_policy': 'Never', 'runtime_class_name': None, 'scheduler_name': 'default-scheduler', 'scheduling_gates': None, 'security_context': {'app_armor_profile': None, 'fs_group': None, 'fs_group_change_policy': None, 'run_as_group': None, 'run_as_non_root': None, 'run_as_user': None, 'se_linux_change_policy': None, 'se_linux_options': None, 'seccomp_profile': None, 'supplemental_groups': None, 'supplemental_groups_policy': None, 'sysctls': None, 'windows_options': None}, 'service_account': 'default', 'service_account_name': 'default', 'set_hostname_as_fqdn': None, 'share_process_namespace': None, 'subdomain': None, 'termination_grace_period_seconds': 30, 'tolerations': [{'effect': 'NoExecute', 'key': 'node.kubernetes.io/not-ready', 'operator': 'Exists', 'toleration_seconds': 300, 'value': None}, {'effect': 'NoExecute', 'key': 'node.kubernetes.io/unreachable', 'operator': 'Exists', 'toleration_seconds': 300, 'value': None}], 'topology_spread_constraints': None, 'volumes': [{'aws_elastic_block_store': None, 'azure_disk': None, 'azure_file': None, 'cephfs': None, 'cinder': None, 'config_map': None, 'csi': None, 
'downward_api': None, 'empty_dir': None, 'ephemeral': None, 'fc': None, 'flex_volume': None, 'flocker': None, 'gce_persistent_disk': None, 'git_repo': None, 'glusterfs': None, 'host_path': None, 'image': None, 'iscsi': None, 'name': 'airflow-ml-model-training-volume', 'nfs': None, 'persistent_volume_claim': {'claim_name': 'airflow-ml-model-training', 'read_only': None}, 'photon_persistent_disk': None, 'portworx_volume': None, 'projected': None, 'quobyte': None, 'rbd': None, 'scale_io': None, 'secret': None, 'storageos': None, 'vsphere_volume': None}, {'aws_elastic_block_store': None, 'azure_disk': None, 'azure_file': None, 'cephfs': None, 'cinder': None, 'config_map': None, 'csi': None, 'downward_api': None, 'empty_dir': None, 'ephemeral': None, 'fc': None, 'flex_volume': None, 'flocker': None, 'gce_persistent_disk': None, 'git_repo': None, 'glusterfs': None, 'host_path': None, 'image': None, 'iscsi': None, 'name': 'kube-api-access-t2jw5', 'nfs': None, 'persistent_volume_claim': None, 'photon_persistent_disk': None, 'portworx_volume': None, 'projected': {'default_mode': 420, 'sources': [{'cluster_trust_bundle': None, 'config_map': None, 'downward_api': None, 'secret': None, 'service_account_token': {'audience': None, 'expiration_seconds': 3607, 'path': 'token'}}, {'cluster_trust_bundle': None, 'config_map': {'items': [{'key': 'ca.crt', 'mode': None, 'path': 'ca.crt'}], 'name': 'kube-root-ca.crt', 'optional': None}, 'downward_api': None, 'secret': None, 'service_account_token': None}, {'cluster_trust_bundle': None, 'config_map': None, 'downward_api': {'items': [{'field_ref': {'api_version': 'v1', 'field_path': 'metadata.namespace'}, 'mode': None, 'path': 'namespace', 'resource_field_ref': None}]}, 'secret': None, 'service_account_token': None}]}, 'quobyte': None, 'rbd': None, 'scale_io': None, 'secret': None, 'storageos': None, 'vsphere_volume': None}]}, 'status': {'conditions': [{'last_probe_time': None, 'last_transition_time': datetime.datetime(2025, 9, 4, 13, 41, 
54, tzinfo=tzlocal()), 'message': None, 'reason': None, 'status': 'True', 'type': 'Initialized'}, {'last_probe_time': None, 'last_transition_time': datetime.datetime(2025, 9, 4, 13, 42, 27, tzinfo=tzlocal()), 'message': None, 'reason': 'PodFailed', 'status': 'False', 'type': 'Ready'}, {'last_probe_time': None, 'last_transition_time': datetime.datetime(2025, 9, 4, 13, 42, 27, tzinfo=tzlocal()), 'message': None, 'reason': 'PodFailed', 'status': 'False', 'type': 'ContainersReady'}, {'last_probe_time': None, 'last_transition_time': datetime.datetime(2025, 9, 4, 13, 41, 54, tzinfo=tzlocal()), 'message': None, 'reason': None, 'status': 'True', 'type': 'PodScheduled'}], 'container_statuses': [{'allocated_resources': None, 'allocated_resources_status': None, 'container_id': 'containerd://cc4317edd3504105a422701df2822c10ba9b96c335e929b87af20bedfdb782b8', 'image': 'docker-registry.discovery.wmnet/repos/machine-learning/ml-pipelines:job-605476', 'image_id': 'docker-registry.discovery.wmnet/repos/machine-learning/ml-pipelines@sha256:65f31f2bf9cf539521a2a67dec6352dcf59749eeaeec6918c41db6efa595250a', 'last_state': {'running': None, 'terminated': None, 'waiting': None}, 'name': 'base', 'ready': False, 'resources': None, 'restart_count': 0, 'started': False, 'state': {'running': None, 'terminated': {'container_id': 'containerd://cc4317edd3504105a422701df2822c10ba9b96c335e929b87af20bedfdb782b8', 'exit_code': 1, 'finished_at': datetime.datetime(2025, 9, 4, 13, 42, 26, tzinfo=tzlocal()), 'message': None, 'reason': 'Error', 'signal': None, 'started_at': datetime.datetime(2025, 9, 4, 13, 42, 18, tzinfo=tzlocal())}, 'waiting': None}, 'user': None, 'volume_mounts': None}], 'ephemeral_container_statuses': None, 'host_i_ps': None, 'host_ip': '10.64.16.210', 'init_container_statuses': None, 'message': None, 'nominated_node_name': None, 'phase': 'Failed', 'pod_i_ps': [{'ip': '10.67.30.60'}, {'ip': '2620:0:861:302:677:6c3a:553b:5fc'}], 'pod_ip': '10.67.30.60', 'qos_class': 'Burstable', 
'reason': None, 'resize': None, 'resource_claim_statuses': None, 'start_time': datetime.datetime(2025, 9, 4, 13, 41, 54, tzinfo=tzlocal())}}
[2025-09-04, 13:42:30 UTC] {taskinstance.py:1226} INFO - Marking task as UP_FOR_RETRY. dag_id=example_ml_training_pipeline, task_id=train_model, run_id=manual__2025-09-04T13:41:48.513453+00:00, execution_date=20250904T134148, start_date=20250904T134154, end_date=20250904T134230
[2025-09-04, 13:42:30 UTC] {taskinstance.py:341} ▶ Post task execution logs