
Debug : kfp.Client().upload_pipeline(): Failed to start a transaction to create a new pipeline and

Date: 2024-02-14 17:33:05
Tags: pipeline, transaction, wafer, kfp, upload, server, api, new

[ERROR: Failed to start a transaction to create a new pipeline and a new pipeline version: dial tcp: lookup mysql on 10.96.0.10:53: no such host]

>>> kfp.Client().upload_pipeline("/home/maye/pipeline_wafer_distribute.yaml", "pipeline_wafer_ps_worker_mount_pv", "wafer pipeline with distributed training,parameter server srategy.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp/_client.py", line 1232, in upload_pipeline
    response = self._upload_api.upload_pipeline(
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 69, in upload_pipeline
    return self.upload_pipeline_with_http_info(uploadfile, **kwargs)  # noqa: E501
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 163, in upload_pipeline_with_http_info
    return self.api_client.call_api(
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 364, in call_api
    return self.__call_api(resource_path, method,
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 188, in __call_api
    raise e
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 181, in __call_api
    response_data = self.request(
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/api_client.py", line 407, in request
    return self.rest_client.POST(url,
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/rest.py", line 265, in POST
    return self.request("POST", url,
  File "/home/maye/anaconda3/lib/python3.9/site-packages/kfp_server_api/rest.py", line 224, in request
    raise ApiException(http_resp=r)
kfp_server_api.exceptions.ApiException: (500)
Reason: Internal Server Error
HTTP response headers: HTTPHeaderDict({'Audit-Id': '46804f38-da18-4357-927a-b78b4e8b7574', 'Cache-Control': 'no-cache, private', 'Content-Length': '561', 'Content-Type': 'text/plain; charset=utf-8', 'Date': 'Wed, 14 Feb 2024 09:09:22 GMT'})
HTTP response body: {"error_message":"Failed to create a pipeline and a pipeline version: Failed to create a pipeline and a pipeline version: InternalServerError: Failed to start a transaction to create a new pipeline and a new pipeline version: dial tcp: lookup mysql on 10.96.0.10:53: no such host","error_details":"Failed to create a pipeline and a pipeline version: Failed to create a pipeline and a pipeline version: InternalServerError: Failed to start a transaction to create a new pipeline and a new pipeline version: dial tcp: lookup mysql on 10.96.0.10:53: no such host"}
>>> 

[ANALYSIS]

The key part of the error is 'dial tcp: lookup mysql on 10.96.0.10:53: no such host'. It says that the host name 'mysql' could not be resolved by the DNS (Domain Name System) server at 10.96.0.10:53, which is the CoreDNS service of the Kubernetes cluster. Note that the same error is also raised when resolving the host name times out. In this case the mysql service was running normally, and it runs on the same computer as the one calling kfp.Client().upload_pipeline(). The most likely cause is therefore a name resolution timeout: CoreDNS may temporarily not have had enough resources to answer the query. A connection timeout is unlikely, since connections within one computer should be fast.
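
To check whether the name resolves at a given moment, a small probe like the one below can help. It is only a sketch under assumptions not stated above: the helper name can_resolve is made up for illustration, port 3306 is simply the MySQL default, and the probe has to run somewhere that uses the cluster DNS (for example inside a pod in the same namespace as the pipeline API server), because 'mysql' is a cluster-internal service name.

```python
import socket


def can_resolve(host, port=3306, resolver=socket.getaddrinfo):
    """Return (True, [addresses]) if host resolves, else (False, error text).

    can_resolve is a hypothetical helper; port 3306 is an assumed default.
    The resolver argument exists only so the lookup can be swapped out.
    """
    try:
        infos = resolver(host, port)
    except socket.gaierror as err:
        # Raised for "no such host" as well as temporary lookup failures.
        return False, str(err)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses


if __name__ == "__main__":
    # Run from a place that uses the cluster DNS, otherwise 'mysql'
    # is not expected to resolve at all.
    print(can_resolve("mysql"))
```

If the probe fails at one moment and succeeds shortly afterwards while the mysql service itself is unchanged, that is consistent with a temporary DNS problem rather than a missing service.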

[SOLUTION]

Retry the upload. The second attempt succeeded:

>>> kfp.Client().upload_pipeline("/home/maye/pipeline_wafer_distribute.yaml", "pipeline_wafer_ps_worker_mount_pv", "wafer pipeline with distributed training,parameter server srategy.")
{'created_at': datetime.datetime(2024, 2, 14, 9, 31, 19, tzinfo=tzutc()),
 'default_version': {'code_source_url': None,
                     'created_at': datetime.datetime(2024, 2, 14, 9, 31, 19, tzinfo=tzutc()),
                     'description': 'wafer pipeline with distributed '
                                    'training,parameter server srategy.',
                     'id': '82508c98-3349-4d42-bdfd-3d131615e7ea',
                     'name': 'pipeline_wafer_ps_worker_mount_pv',
                     'package_url': None,
                     'parameters': [{'name': 'pipeline-root',
                                     'value': '/tfx/tfx_pv/pipelines/detect_anomolies_on_wafer_tfdv_schema'}],
                     'resource_references': [{'key': {'id': '7cbbd9e1-6657-4fe4-ac6c-e11beeaa0d4f',
                                                      'type': 'PIPELINE'},
                                              'name': None,
                                              'relationship': 'OWNER'}]},
 'description': 'wafer pipeline with distributed training,parameter server '
                'srategy.',
 'error': None,
 'id': '7cbbd9e1-6657-4fe4-ac6c-e11beeaa0d4f',
 'name': 'pipeline_wafer_ps_worker_mount_pv',
 'parameters': [{'name': 'pipeline-root',
                 'value': '/tfx/tfx_pv/pipelines/detect_anomolies_on_wafer_tfdv_schema'}],
 'resource_references': None,
 'url': None}
>>> 
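
Since the fix here was simply to retry, the retry can also be automated. The following is a minimal sketch, not part of kfp: retry_call and its parameters (attempts, delay, is_transient) are hypothetical names, and the attempt count and waiting times are arbitrary choices.

```python
import time


def retry_call(func, attempts=3, delay=2.0,
               is_transient=lambda err: True, sleep=time.sleep):
    """Call func() up to `attempts` times, waiting longer after each failure.

    Hypothetical helper: is_transient decides whether an exception is
    worth retrying; by default every exception is retried.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as err:
            # Give up on the last attempt or on a non-transient error.
            if attempt == attempts or not is_transient(err):
                raise
            sleep(delay * attempt)  # linear backoff: delay, 2*delay, ...


# Possible usage with the call from this post (requires kfp and a cluster):
#
# import kfp
# result = retry_call(
#     lambda: kfp.Client().upload_pipeline(
#         "/home/maye/pipeline_wafer_distribute.yaml",
#         "pipeline_wafer_ps_worker_mount_pv",
#         "wafer pipeline with distributed training,parameter server srategy."),
#     # Assumption: only retry when the message mentions the DNS failure.
#     is_transient=lambda err: "no such host" in str(err),
# )
```

Restricting is_transient to the DNS message keeps real errors, such as an invalid pipeline file, from being retried pointlessly.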

From: https://www.cnblogs.com/zhenxia-jiuyou/p/18015335
