Debug : kfp.Client().upload_pipeline(): Failed to start a transaction to create a new pipeline and

Date: 2024-02-14 17:33:05

[ERROR: Failed to start a transaction to create a new pipeline and a new pipeline version: dial tcp: lookup mysql on no such host]

>>> kfp.Client().upload_pipeline("/home/maye/pipeline_wafer_distribute.yaml", "pipeline_wafer_ps_worker_mount_pv", "wafer pipeline with distributed training,parameter server srategy.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp/_client.py", line 1232, in upload_pipeline
    response = self._upload_api.upload_pipeline(
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 69, in upload_pipeline
    return self.upload_pipeline_with_http_info(uploadfile, **kwargs)  # noqa: E501
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 163, in upload_pipeline_with_http_info
    return self.api_client.call_api(
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 364, in call_api
    return self.__call_api(resource_path, method,
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 188, in __call_api
    raise e
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 181, in __call_api
    response_data = self.request(
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 407, in request
    return self.rest_client.POST(url,
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/rest.py", line 265, in POST
    return self.request("POST", url,
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/rest.py", line 224, in request
    raise ApiException(http_resp=r)
kfp_server_api.exceptions.ApiException: (500)
Reason: Internal Server Error
HTTP response headers: HTTPHeaderDict({'Audit-Id': '46804f38-da18-4357-927a-b78b4e8b7574', 'Cache-Control': 'no-cache, private', 'Content-Length': '561', 'Content-Type': 'text/plain; charset=utf-8', 'Date': 'Wed, 14 Feb 2024 09:09:22 GMT'})
HTTP response body: {"error_message":"Failed to create a pipeline and a pipeline version: Failed to create a pipeline and a pipeline version: InternalServerError: Failed to start a transaction to create a new pipeline and a new pipeline version: dial tcp: lookup mysql on no such host","error_details":"Failed to create a pipeline and a pipeline version: Failed to create a pipeline and a pipeline version: InternalServerError: Failed to start a transaction to create a new pipeline and a new pipeline version: dial tcp: lookup mysql on no such host"}


Error: 'dial tcp: lookup mysql on no such host'. This error says that no host named 'mysql' was found during the DNS (Domain Name System) lookup, which in a Kubernetes cluster is handled by CoreDNS. Note that this error can also be raised when resolving the domain name (that is, the host name) times out. In this example the mysql service was running normally, and it was on the same machine as the kfp.Client().upload_pipeline() call. The most likely cause is therefore a name-resolution timeout, perhaps because CoreDNS temporarily lacked the resources to process the lookup request. A connection timeout is unlikely, since connections within one machine should be fast.


Retrying the same call then succeeded:

>>> kfp.Client().upload_pipeline("/home/maye/pipeline_wafer_distribute.yaml", "pipeline_wafer_ps_worker_mount_pv", "wafer pipeline with distributed training,parameter server srategy.")
{'created_at': datetime.datetime(2024, 2, 14, 9, 31, 19, tzinfo=tzutc()),
 'default_version': {'code_source_url': None,
                     'created_at': datetime.datetime(2024, 2, 14, 9, 31, 19, tzinfo=tzutc()),
                     'description': 'wafer pipeline with distributed '
                                    'training,parameter server srategy.',
                     'id': '82508c98-3349-4d42-bdfd-3d131615e7ea',
                     'name': 'pipeline_wafer_ps_worker_mount_pv',
                     'package_url': None,
                     'parameters': [{'name': 'pipeline-root',
                                     'value': '/tfx/tfx_pv/pipelines/detect_anomolies_on_wafer_tfdv_schema'}],
                     'resource_references': [{'key': {'id': '7cbbd9e1-6657-4fe4-ac6c-e11beeaa0d4f',
                                                      'type': 'PIPELINE'},
                                              'name': None,
                                              'relationship': 'OWNER'}]},
 'description': 'wafer pipeline with distributed training,parameter server '
                'srategy.',
 'error': None,
 'id': '7cbbd9e1-6657-4fe4-ac6c-e11beeaa0d4f',
 'name': 'pipeline_wafer_ps_worker_mount_pv',
 'parameters': [{'name': 'pipeline-root',
                 'value': '/tfx/tfx_pv/pipelines/detect_anomolies_on_wafer_tfdv_schema'}],
 'resource_references': None,
 'url': None}
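
The manual retry above can be automated with a small wrapper. This is a sketch under stated assumptions, not something from the original session: `call_with_retry` and `flaky_upload` are hypothetical names, and the stand-in function fails once and then succeeds to mimic the behaviour seen here without needing a cluster.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0,
                    should_retry=lambda exc: True, sleep=time.sleep):
    """Call func(); if it raises and should_retry(exc) is true, wait and retry.

    The wait doubles after each failed attempt (base_delay, 2 * base_delay, ...).
    Once the attempts are used up, the last exception is re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            is_last_attempt = attempt == attempts - 1
            if is_last_attempt or not should_retry(exc):
                raise
            sleep(base_delay * 2 ** attempt)


# Hypothetical stand-in for the upload: fails on the first call only.
calls = {"count": 0}


def flaky_upload():
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("dial tcp: lookup mysql on no such host")
    return "uploaded"


result = call_with_retry(flaky_upload, base_delay=0.01)
print(result, calls["count"])
```

With the real client, `func` would wrap the same upload_pipeline call used above (for example in a lambda), and `should_retry` could be narrowed so that only errors mentioning the failed host lookup are retried.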

From: https://www.cnblogs.com/zhenxia-jiuyou/p/18015335

