Hadoop YARN - Introduction to the web services REST APIs

Date: 2023-06-04 13:06:51

Overview

The Hadoop YARN web service REST APIs are a set of URI resources that give access to the cluster, nodes, applications, and application historical information. The URI resources are grouped into APIs based on the type of information returned. Some URI resources return collections while others return singletons.



URIs

The URIs for the REST-based Web services have the following syntax:



http://{http address of service}/ws/{version}/{resourcepath}



The elements in this syntax are as follows:



* {http address of service} - The HTTP address of the service to get information about. Currently supported are the ResourceManager, NodeManager, MapReduce application master, and history server.
* {version} - The version of the APIs. In this release, the version is v1.
* {resourcepath} - A path that defines a singleton resource or a collection of resources.
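
The URI syntax above can be assembled programmatically. The following is a minimal sketch, not part of the Hadoop distribution: the helper name build_ws_url is hypothetical, and the host and application id are the sample values used elsewhere in this document.

```python
from urllib.parse import quote


def build_ws_url(http_address: str, resource_path: str, version: str = "v1") -> str:
    """Assemble http://{http address of service}/ws/{version}/{resourcepath}."""
    # Percent-encode each path segment but keep the slashes between segments.
    segments = resource_path.strip("/").split("/")
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"http://{http_address}/ws/{version}/{path}"


# A singleton resource on the ResourceManager (sample host and id).
print(build_ws_url("rmhost.domain:8088", "cluster/apps/application_1324057493980_0001"))
```

Keeping the version as a parameter with a default of v1 means callers only need to touch it if a later release introduces another version.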


HTTP Requests

To invoke a REST API, your application calls an HTTP operation on the URI associated with a resource.



Summary of HTTP operations

Currently only GET is supported. It retrieves information about the resource specified.



Security

The web service REST APIs go through the same security as the web UI. If your cluster administrators have filters enabled, you must authenticate via the mechanism they specified.



Headers Supported



* Accept
* Accept-Encoding



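Currently the only fields used in the header are Accept and Accept-Encoding. Accept currently supports XML and JSON for the response type you accept. Accept-Encoding currently supports only gzip: the response is gzip-compressed when it is specified and uncompressed otherwise.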



HTTP Responses

The next few sections describe some of the syntax and other details of the HTTP Responses of the web service REST APIs.



Compression

This release supports gzip compression if you specify gzip in the Accept-Encoding header of the HTTP request (Accept-Encoding: gzip).
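
A client other than curl has to set these headers itself and, depending on the HTTP library, undo the compression as well. The sketch below uses only the Python standard library. The function names are hypothetical, and it assumes the body is UTF-8 text and that the server reports compression in the Content-Encoding response header.

```python
import gzip
import urllib.request
from typing import Optional


def make_request(url: str, fmt: str = "json", compressed: bool = True) -> urllib.request.Request:
    """Build a GET request carrying the two supported headers."""
    headers = {"Accept": f"application/{fmt}"}  # fmt is "json" or "xml"
    if compressed:
        headers["Accept-Encoding"] = "gzip"
    return urllib.request.Request(url, headers=headers, method="GET")


def decode_body(raw: bytes, content_encoding: Optional[str]) -> str:
    """Undo gzip if the server applied it, then decode the text (UTF-8 assumed)."""
    if content_encoding and content_encoding.lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


def fetch(url: str, fmt: str = "json", compressed: bool = True) -> str:
    """Perform the call; urllib does not decompress responses on its own."""
    with urllib.request.urlopen(make_request(url, fmt, compressed)) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```

Splitting request construction and body decoding from the network call keeps both pieces checkable without a running cluster.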



Response Formats

This release of the web service REST APIs supports responses in JSON and XML formats. JSON is the default. To set the response format, you can specify the format in the Accept header of the HTTP request.

As specified in HTTP Response Codes, the response body can contain the data that represents the resource or an error message. In the case of success, the response body is in the selected format, either JSON or XML. In the case of error, the response body is in either JSON or XML based on the format requested. The Content-Type header of the response contains the format requested. If the application requests an unsupported format, the response status code is 500.

Note that the order of the fields within the response body is not specified and might change. Also, additional fields might be added to a response body. Therefore, your applications should use parsing routines that can extract data from a response body in any order.
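
One way to follow this advice is to look fields up by name and ignore anything unexpected. The sketch below is illustrative only: the two bodies are hand-written, the second reorders the fields and adds a made-up field called newField, and app_summary is a hypothetical helper.

```python
import json

# Two renderings of the same resource: different field order, one extra field.
body_a = '{"app": {"id": "application_1324057493980_0001", "state": "ACCEPTED", "queue": "default"}}'
body_b = ('{"app": {"queue": "default", "newField": 1, '
          '"state": "ACCEPTED", "id": "application_1324057493980_0001"}}')


def app_summary(body: str) -> dict:
    """Pick the fields of interest by name; order and extra fields do not matter."""
    app = json.loads(body)["app"]
    return {key: app.get(key) for key in ("id", "state", "queue")}


# Both bodies yield the same summary.
print(app_summary(body_a) == app_summary(body_b))
```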



Response Errors

After making an HTTP request, an application should check the response status code to verify success or detect an error. If the response status code indicates an error, the response body contains an error message. The first field is the exception type; currently only RemoteException is returned. The following table lists the items within the RemoteException error message:

Item            Data Type   Description
exception       String      Exception type
javaClassName   String      Java class name of exception
message         String      Detailed message of exception



Response Examples



JSON response with single resource

HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_0001

Response Status Line: HTTP/1.1 200 OK

Response Header:



HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)



Response Body:



{
  "app":
  {
    "id":"application_1324057493980_0001",
    "user":"user1",
    "name":"",
    "queue":"default",
    "state":"ACCEPTED",
    "finalStatus":"UNDEFINED",
    "progress":0,
    "trackingUI":"UNASSIGNED",
    "diagnostics":"",
    "clusterId":1324057493980,
    "startedTime":1324057495921,
    "finishedTime":0,
    "elapsedTime":2063,
    "amContainerLogs":"http:\/\/amNM:2\/node\/containerlogs\/container_1324057493980_0001_01_000001",
    "amHostHttpAddress":"amNM:2"
  }
}



JSON response with Error response

Here we request information about an application that doesn’t exist yet.

HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_9999

Response Status Line: HTTP/1.1 404 Not Found

Response Header:


HTTP/1.1 404 Not Found
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)



Response Body:



{
   "RemoteException" : {
      "javaClassName" : "org.apache.hadoop.yarn.webapp.NotFoundException",
      "exception" : "NotFoundException",
      "message" : "java.lang.Exception: app with id: application_1324057493980_9999 not found"
   }
}
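
A client can turn an error body like the one above into an exception object. This is a minimal sketch under stated assumptions: YarnRestError and error_from_response are hypothetical names, and any status outside the 2xx range is treated as an error.

```python
import json
from typing import Optional


class YarnRestError(Exception):
    """Hypothetical wrapper for the three RemoteException items."""

    def __init__(self, status: int, exception: str, java_class_name: str, message: str):
        super().__init__(f"{status} {exception}: {message}")
        self.status = status
        self.exception = exception
        self.java_class_name = java_class_name
        self.message = message


def error_from_response(status: int, body: str) -> Optional[YarnRestError]:
    """Return None for a 2xx status, otherwise the parsed error."""
    if 200 <= status < 300:
        return None
    try:
        data = json.loads(body)
    except ValueError:  # body was not JSON; fall back to the raw text
        data = {}
    remote = data.get("RemoteException", {}) if isinstance(data, dict) else {}
    return YarnRestError(
        status,
        remote.get("exception", "Unknown"),
        remote.get("javaClassName", ""),
        remote.get("message", body),
    )


sample = """{
   "RemoteException" : {
      "javaClassName" : "org.apache.hadoop.yarn.webapp.NotFoundException",
      "exception" : "NotFoundException",
      "message" : "java.lang.Exception: app with id: application_1324057493980_9999 not found"
   }
}"""
err = error_from_response(404, sample)
print(err.exception, "-", err.message)
```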



Sample Usage

You can call the web services REST APIs from any language or tool. This example uses the curl command-line tool to make the REST GET calls.

In this example, a user submits a MapReduce application to the ResourceManager using a command like:



hadoop jar hadoop-mapreduce-test.jar sleep -Dmapred.job.queue.name=a1 -m 1 -r 1 -rt 1200000 -mt 20



The client prints information about the job submitted along with the application id, similar to:



12/01/18 04:25:15 INFO mapred.ResourceMgrDelegate: Submitted application application_1326821518301_0010 to ResourceManager at host.domain.com/10.10.10.10:8032
12/01/18 04:25:15 INFO mapreduce.Job: Running job: job_1326821518301_0010
12/01/18 04:25:21 INFO mapred.ClientServiceDelegate: The url to track the job: host.domain.com:8088/proxy/application_1326821518301_0010/
12/01/18 04:25:22 INFO mapreduce.Job: Job job_1326821518301_0010 running in uber mode : false
12/01/18 04:25:22 INFO mapreduce.Job:  map 0% reduce 0%



The user then wishes to track the application. The user starts by getting the information about the application from the ResourceManager. Use the --compressed option to request compressed output; curl handles decompression on the client side.



curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"



Output:



{
   "app" : {
      "finishedTime" : 0,
      "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI" : "ApplicationMaster",
      "state" : "RUNNING",
      "user" : "user1",
      "id" : "application_1326821518301_0010",
      "clusterId" : 1326821518301,
      "finalStatus" : "UNDEFINED",
      "amHostHttpAddress" : "host.domain.com:8042",
      "progress" : 82.44703,
      "name" : "Sleep job",
      "startedTime" : 1326860715335,
      "elapsedTime" : 31814,
      "diagnostics" : "",
      "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/",
      "queue" : "a1"
   }
}
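
Tracking usually means repeating this call until the state stops changing. The sketch below shows one possible polling loop, not a prescribed method. It takes the HTTP call as a parameter (fetch_body, a hypothetical callable returning the JSON text) so the loop can be exercised without a cluster. FINISHED is the end state shown in this document; FAILED and KILLED are assumed here as the other end states of a YARN application.

```python
import json
import time
from typing import Callable

# FINISHED appears in this document; FAILED and KILLED are assumptions.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def wait_for_app(fetch_body: Callable[[], str], poll_seconds: float = 5.0,
                 max_polls: int = 100,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the app reaches a terminal state and return its last record."""
    for _ in range(max_polls):
        app = json.loads(fetch_body())["app"]
        print(f"{app.get('id')}: {app.get('state')} {app.get('progress')}%")
        if app.get("state") in TERMINAL_STATES:
            return app
        sleep(poll_seconds)
    raise TimeoutError(f"application still running after {max_polls} polls")
```

In real use, fetch_body would wrap whatever HTTP client issues the GET shown in the curl command; the poll interval and the limit of 100 polls are arbitrary defaults.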



The user then wishes to get more details about the running application and goes directly to the MapReduce application master for this application. The ResourceManager lists the trackingUrl that can be used for this application: http://host.domain.com:8088/proxy/application_1326821518301_0010. This URL can be opened in a web browser or used with the web service REST APIs. The user uses the web service REST APIs to get the list of jobs this MapReduce application master is running:



curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"



Output:



{
   "jobs" : {
      "job" : [
         {
            "runningReduceAttempts" : 1,
            "reduceProgress" : 72.104515,
            "failedReduceAttempts" : 0,
            "newMapAttempts" : 0,
            "mapsRunning" : 0,
            "state" : "RUNNING",
            "successfulReduceAttempts" : 0,
            "reducesRunning" : 1,
            "acls" : [
               {
                  "value" : " ",
                  "name" : "mapreduce.job.acl-modify-job"
               },
               {
                  "value" : " ",
                  "name" : "mapreduce.job.acl-view-job"
               }
            ],
            "reducesPending" : 0,
            "user" : "user1",
            "reducesTotal" : 1,
            "mapsCompleted" : 1,
            "startTime" : 1326860720902,
            "id" : "job_1326821518301_10_10",
            "successfulMapAttempts" : 1,
            "runningMapAttempts" : 0,
            "newReduceAttempts" : 0,
            "name" : "Sleep job",
            "mapsPending" : 0,
            "elapsedTime" : 64432,
            "reducesCompleted" : 0,
            "mapProgress" : 100,
            "diagnostics" : "",
            "failedMapAttempts" : 0,
            "killedReduceAttempts" : 0,
            "mapsTotal" : 1,
            "uberized" : false,
            "killedMapAttempts" : 0,
            "finishTime" : 0
         }
      ]
   }
}
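
The URLs for the next level of detail can be derived from the trackingUrl and the job ids in this listing. A minimal sketch, with a hypothetical helper name and a trimmed copy of the jobs response above:

```python
import json
from typing import List


def task_urls(tracking_url: str, jobs_body: str) -> List[str]:
    """Build the .../jobs/{jobid}/tasks URL for every job in a jobs listing."""
    base = tracking_url.rstrip("/") + "/ws/v1/mapreduce/jobs"
    jobs = json.loads(jobs_body)["jobs"]["job"]
    return [f"{base}/{job['id']}/tasks" for job in jobs]


# Trimmed copy of the jobs response shown above.
jobs_body = '{"jobs": {"job": [{"id": "job_1326821518301_10_10", "state": "RUNNING"}]}}'
tracking_url = "http://host.domain.com:8088/proxy/application_1326821518301_0010/"

for url in task_urls(tracking_url, jobs_body):
    print(url)
```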



The user then wishes to get the task details about the job with job id job_1326821518301_10_10 that was listed above.


curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"



Output:



{
   "tasks" : {
      "task" : [
         {
            "progress" : 100,
            "elapsedTime" : 5059,
            "state" : "SUCCEEDED",
            "startTime" : 1326860725014,
            "id" : "task_1326821518301_10_10_m_0",
            "type" : "MAP",
            "successfulAttempt" : "attempt_1326821518301_10_10_m_0_0",
            "finishTime" : 1326860730073
         },
         {
            "progress" : 72.104515,
            "elapsedTime" : 0,
            "state" : "RUNNING",
            "startTime" : 1326860732984,
            "id" : "task_1326821518301_10_10_r_0",
            "type" : "REDUCE",
            "successfulAttempt" : "",
            "finishTime" : 0
         }
      ]
   }
}
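
When a job has many tasks, grouping the ids by state makes it easier to see which ones still need attention. The helper name below is hypothetical and the sample is a trimmed copy of the response above.

```python
import json
from collections import defaultdict
from typing import Dict, List


def tasks_by_state(tasks_body: str) -> Dict[str, List[str]]:
    """Group task ids by their reported state."""
    grouped = defaultdict(list)
    for task in json.loads(tasks_body)["tasks"]["task"]:
        grouped[task["state"]].append(task["id"])
    return dict(grouped)


# Trimmed copy of the tasks response shown above.
tasks_body = """{
   "tasks" : {
      "task" : [
         {"id" : "task_1326821518301_10_10_m_0", "type" : "MAP", "state" : "SUCCEEDED"},
         {"id" : "task_1326821518301_10_10_r_0", "type" : "REDUCE", "state" : "RUNNING"}
      ]
   }
}"""
print(tasks_by_state(tasks_body))
```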



The map task has finished but the reduce task is still running. The user wishes to get the task attempt information for the reduce task task_1326821518301_10_10_r_0. Note that the Accept header isn’t really required here since JSON is the default output format:



curl --compressed -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"



Output:



{
   "taskAttempts" : {
      "taskAttempt" : [
         {
            "elapsedMergeTime" : 158,
            "shuffleFinishTime" : 1326860735378,
            "assignedContainerId" : "container_1326821518301_0010_01_000003",
            "progress" : 72.104515,
            "elapsedTime" : 0,
            "state" : "RUNNING",
            "elapsedShuffleTime" : 2394,
            "mergeFinishTime" : 1326860735536,
            "rack" : "/10.10.10.0",
            "elapsedReduceTime" : 0,
            "nodeHttpAddress" : "host.domain.com:8042",
            "type" : "REDUCE",
            "startTime" : 1326860732984,
            "id" : "attempt_1326821518301_10_10_r_0_0",
            "finishTime" : 0
         }
      ]
   }
}



The reduce attempt is still running and the user wishes to see the current counter values for that attempt:



curl --compressed -H "Accept: application/json"  -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"



Output:



{
   "JobTaskAttemptCounters" : {
      "taskAttemptCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               {
                  "value" : 4216,
                  "name" : "FILE_BYTES_READ"
               }, 
               {
                  "value" : 77151,
                  "name" : "FILE_BYTES_WRITTEN"
               }, 
               {
                  "value" : 0,
                  "name" : "FILE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "FILE_LARGE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "FILE_WRITE_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_BYTES_READ"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_BYTES_WRITTEN"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_LARGE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_WRITE_OPS"
               }
            ]  
         }, 
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "COMBINE_INPUT_RECORDS"
               }, 
               {
                  "value" : 0,
                  "name" : "COMBINE_OUTPUT_RECORDS"
               }, 
               {  
                  "value" : 1767,
                  "name" : "REDUCE_INPUT_GROUPS"
               },
               {  
                  "value" : 25104,
                  "name" : "REDUCE_SHUFFLE_BYTES"
               },
               {
                  "value" : 1767,
                  "name" : "REDUCE_INPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "REDUCE_OUTPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "SPILLED_RECORDS"
               },
               {
                  "value" : 1,
                  "name" : "SHUFFLED_MAPS"
               },
               {
                  "value" : 0,
                  "name" : "FAILED_SHUFFLE"
               },
               {
                  "value" : 1,
                  "name" : "MERGED_MAP_OUTPUTS"
               },
               {
                  "value" : 50,
                  "name" : "GC_TIME_MILLIS"
               },
               {
                  "value" : 1580,
                  "name" : "CPU_MILLISECONDS"
               },
               {
                  "value" : 141320192,
                  "name" : "PHYSICAL_MEMORY_BYTES"
               },
              {
                  "value" : 1118552064,
                  "name" : "VIRTUAL_MEMORY_BYTES"
               }, 
               {  
                  "value" : 73728000,
                  "name" : "COMMITTED_HEAP_BYTES"
               }
            ]
         },
         {  
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               {  
                  "value" : 0,
                  "name" : "BAD_ID"
               },
               {  
                  "value" : 0,
                  "name" : "CONNECTION"
               },
               {  
                  "value" : 0,
                  "name" : "IO_ERROR"
               },
               {  
                  "value" : 0,
                  "name" : "WRONG_LENGTH"
               },
               {  
                  "value" : 0,
                  "name" : "WRONG_MAP"
               },
               {  
                  "value" : 0,
                  "name" : "WRONG_REDUCE"
               }
            ]
         },
         {  
            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
            "counter" : [
              {  
                  "value" : 0,
                  "name" : "BYTES_WRITTEN"
               }
            ]
         }
      ],
      "id" : "attempt_1326821518301_10_10_r_0_0"
   }
}
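
The nested group/counter layout is awkward to query directly. The sketch below flattens it into a dictionary keyed by group name and then counter name. The helper name is hypothetical and the sample is a trimmed copy of the response above.

```python
import json
from typing import Dict


def flatten_counters(body: str) -> Dict[str, Dict[str, int]]:
    """Map counterGroupName -> {counter name: value}."""
    groups = json.loads(body)["JobTaskAttemptCounters"]["taskAttemptCounterGroup"]
    return {
        group["counterGroupName"]: {c["name"]: c["value"] for c in group["counter"]}
        for group in groups
    }


# Trimmed copy of the counters response shown above.
counters_body = """{
   "JobTaskAttemptCounters" : {
      "taskAttemptCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               {"value" : 4216, "name" : "FILE_BYTES_READ"},
               {"value" : 77151, "name" : "FILE_BYTES_WRITTEN"}
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
            "counter" : [
               {"value" : 25104, "name" : "REDUCE_SHUFFLE_BYTES"}
            ]
         }
      ],
      "id" : "attempt_1326821518301_10_10_r_0_0"
   }
}"""
counters = flatten_counters(counters_body)
print(counters["org.apache.hadoop.mapreduce.TaskCounter"]["REDUCE_SHUFFLE_BYTES"])
```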



The job finishes and the user wishes to get the final job information from the history server for this job.



curl --compressed -X GET "http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"



Output:



{
   "job" : {
      "avgReduceTime" : 1250784,
      "failedReduceAttempts" : 0,
      "state" : "SUCCEEDED",
      "successfulReduceAttempts" : 1,
      "acls" : [
         {
            "value" : " ",
            "name" : "mapreduce.job.acl-modify-job"
         },
         {
            "value" : " ",
            "name" : "mapreduce.job.acl-view-job"
         }
      ],
      "user" : "user1",
      "reducesTotal" : 1,
      "mapsCompleted" : 1,
      "startTime" : 1326860720902,
      "id" : "job_1326821518301_10_10",
      "avgMapTime" : 5059,
      "successfulMapAttempts" : 1,
      "name" : "Sleep job",
      "avgShuffleTime" : 2394,
      "reducesCompleted" : 1,
      "diagnostics" : "",
      "failedMapAttempts" : 0,
      "avgMergeTime" : 2552,
      "killedReduceAttempts" : 0,
      "mapsTotal" : 1,
      "queue" : "a1",
      "uberized" : false,
      "killedMapAttempts" : 0,
      "finishTime" : 1326861986164
   }
}
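
The time fields in these responses appear to be milliseconds since the Unix epoch; that is an assumption based on the values, which line up with the client log timestamps shown earlier. Under that assumption, a short sketch for converting them and computing a run time (helper names are hypothetical):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms: int) -> datetime:
    """Convert epoch milliseconds to an aware UTC datetime (integer arithmetic)."""
    return EPOCH + timedelta(milliseconds=ms)


def duration_ms(start: int, finish: int) -> Optional[int]:
    """Run time in milliseconds, or None while finishTime is still 0."""
    return finish - start if finish else None


# startTime and finishTime of the finished job shown above.
start_time, finish_time = 1326860720902, 1326861986164
print(from_millis(start_time).isoformat())
print(duration_ms(start_time, finish_time))
print(duration_ms(start_time, 0))  # a job that has not finished yet
```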



The user also gets the final application information from the ResourceManager.



curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

Output:



{
   "app" : {
      "finishedTime" : 1326861991282,
      "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI" : "History",
      "state" : "FINISHED",
      "user" : "user1",
      "id" : "application_1326821518301_0010",
      "clusterId" : 1326821518301,
      "finalStatus" : "SUCCEEDED",
      "amHostHttpAddress" : "host.domain.com:8042",
      "progress" : 100,
      "name" : "Sleep job",
      "startedTime" : 1326860715335,
      "elapsedTime" : 1275947,
      "diagnostics" : "",
      "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/jobhistory/job/job_1326821518301_10_10",
      "queue" : "a1"
   }
}



From: https://blog.51cto.com/u_11860992/6410371
