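Collecting Diagnostic Information in Production
Collecting diagnostic information (such as traces, logs, metrics, and dumps) in a production environment can be challenging. Typically, we have to get access to the environment, install some tools, and then collect the information. dotnet-monitor simplifies and unifies the way diagnostic information is collected by exposing a REST API, regardless of where the application runs (on a local machine, on an on-premises server, or inside a Kubernetes cluster). Depending on our needs, dotnet-monitor can replace other .NET diagnostic tools such as dotnet-counters, dotnet-dump, dotnet-gcdump, and dotnet-trace, specifically for collecting information.
Setup
We can install it as a global tool with the following command: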
dotnet tool install --global dotnet-monitor --version 8.0.0-rc.2.23502.11
Once installed, we can start it with the following command:
dotnet monitor collect --no-auth
dotnet-monitor includes a Swagger UI for exploring the API surface. To test the tool, we will use a standard .NET application. Run the following commands:
dotnet new web -o DotNetMonitorSandBox
dotnet new sln -n DotNetMonitorSandbox
dotnet sln add --in-root DotNetMonitorSandBox
Processes
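The Processes API lists the processes that dotnet-monitor can detect and returns their metadata. Open a browser and navigate to https://localhost:52323/processes to list the available processes (make sure the sample application is running):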
[
{
"pid": 19828,
"uid": "66140161-2208-4e7c-b874-79aa037d4344",
"name": "dotnet",
"isDefault": false
},
{
"pid": 57388,
"uid": "2b0aba55-1579-41a5-b6a7-c1575650352a",
"name": "DotNetMonitorSandBox",
"isDefault": false
}
]
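The pid property is the ID of the process. The uid property is useful for uniquely identifying a process in environments where process IDs may not be unique (for example, multiple containers in a Kubernetes pod will each have an entrypoint process with process ID 1). Navigate to https://localhost:52323/process?pid={pid} to see more information about the specified process, or to https://localhost:52323/env?pid={pid} to get its environment variables.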
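Since uid exists to disambiguate processes, a request can be keyed by it instead of by pid. The following is a minimal sketch, assuming the /process endpoint also accepts uid and name query parameters (the article only shows pid). The helper process_url is hypothetical, and the pid and uid values are taken from the sample listing above.

```shell
# Base address used throughout the article; override MONITOR_URL if yours differs.
MONITOR_URL="${MONITOR_URL:-https://localhost:52323}"

# process_url KEY VALUE: hypothetical helper that prints the /process URL
# for one identifying key (pid, uid, or name).
process_url() {
    printf '%s/process?%s=%s\n' "$MONITOR_URL" "$1" "$2"
}

# Print the two equivalent URLs for the sample application.
process_url pid 57388
process_url uid 2b0aba55-1579-41a5-b6a7-c1575650352a

# To issue the request against a running instance (-k skips validation of the
# local development certificate, which is only reasonable on a test machine):
#   curl -k "$(process_url uid 2b0aba55-1579-41a5-b6a7-c1575650352a)"
```

Inside a Kubernetes pod, where several containers may report pid 1, the uid form is the one that stays unambiguous.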
Logs
The Logs API lets us collect logs that are written to the ILogger<> infrastructure. Open the Program.cs file and update its content as follows:
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () =>
{
app.Logger.LogInformation("Hello World!");
return "Hello World!";
});
app.Run();
Run the application and navigate to https://localhost:52323/logs?pid={pid}&durationSeconds=60 to watch our log entries in real time for the next 60 seconds.
Traces
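The Traces API lets us collect traces of a process as a .nettrace file. To capture a trace using the predefined set of trace profiles (such as Cpu, Http, Logs, and Metrics), navigate to https://localhost:52323/trace?pid={pid}&durationSeconds=60 and wait for the .nettrace file. To emit custom events, open the Program.cs file and update its content as follows: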
using System.Diagnostics.Tracing;
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () =>
{
app.Logger.LogInformation("Hello World!");
MyEventSource.Log.Request("Hello World!");
return "Hello World!";
});
app.Run();
[EventSource(Name = "MyEventSource")]
public sealed class MyEventSource : EventSource
{
public static MyEventSource Log { get; } = new MyEventSource();
[Event(1, Level = EventLevel.Informational)]
public void Request(string message)
{
WriteEvent(1, message);
}
}
To capture traces from a custom event provider, we need to make a POST call to the same endpoint with the following request body:
{
"Providers": [{
"Name": "MyEventSource",
"EventLevel": "Informational"
}],
"BufferSizeInMB": 1024
}
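A browser cannot send this POST directly, so a command-line client is one way to do it. The sketch below assumes curl is available. The helpers trace_body and collect_trace and the output file name trace.nettrace are hypothetical; the body mirrors the JSON above.

```shell
# Base address used throughout the article; override MONITOR_URL if yours differs.
MONITOR_URL="${MONITOR_URL:-https://localhost:52323}"

# trace_body PROVIDER LEVEL: prints the request body shown above
# for a single event provider.
trace_body() {
    printf '{"Providers":[{"Name":"%s","EventLevel":"%s"}],"BufferSizeInMB":1024}\n' "$1" "$2"
}

# collect_trace PID SECONDS: POSTs the body and saves the response.
# -k skips certificate validation and is only reasonable for a local test instance.
collect_trace() {
    trace_body MyEventSource Informational |
        curl -k -X POST \
            -H 'Content-Type: application/json' \
            --data-binary @- \
            -o trace.nettrace \
            "$MONITOR_URL/trace?pid=$1&durationSeconds=$2"
}

# Show the body that would be sent. Against a live instance, run:
#   collect_trace 57388 60
trace_body MyEventSource Informational
```

Generating the body in a function keeps the provider name and level in one place if more providers are added later.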
On Windows, .nettrace files can be opened in PerfView or in Visual Studio for analysis.
Metrics
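The Metrics API returns a snapshot of metrics for a single process in the Prometheus exposition format (the pid is set through configuration). dotnet-monitor can read and merge configuration from multiple sources. On Windows, the user settings file is located at %USERPROFILE%\.dotnet-monitor\settings.json. So let's update the file with the following content (create it if it does not exist):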
{
"DefaultProcess": {
"Filters": [{
"Key": "ProcessId",
"Value": "<pid>"
}]
}
}
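Because the process ID changes on every run of the sample application, it can be convenient to generate this file rather than edit it by hand. The following is a small sketch; default_process_settings is a hypothetical helper, and 57388 is the pid from the earlier sample listing.

```shell
# default_process_settings PID: prints the settings document shown above
# with the given process ID filled in.
default_process_settings() {
    cat <<EOF
{
  "DefaultProcess": {
    "Filters": [{
      "Key": "ProcessId",
      "Value": "$1"
    }]
  }
}
EOF
}

# Print the document for the sample process. Redirect the output to the
# settings.json location for your platform to apply it.
default_process_settings 57388
```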
dotnet-monitor picks up the configuration automatically. By default, metrics are collected from the following providers:
System.Runtime
Microsoft.AspNetCore.Hosting
Grpc.AspNetCore.Server
Navigate to https://localhost:52323/metrics to see output similar to the following:
# HELP systemruntime_cpu_usage_ratio CPU Usage
# TYPE systemruntime_cpu_usage_ratio gauge
systemruntime_cpu_usage_ratio 0 1699198374885
systemruntime_cpu_usage_ratio 0 1699198379898
systemruntime_cpu_usage_ratio 0 1699201002325
# HELP systemruntime_working_set_bytes Working Set
# TYPE systemruntime_working_set_bytes gauge
systemruntime_working_set_bytes 63393792 1699198364894
systemruntime_working_set_bytes 63401984 1699198369888
systemruntime_working_set_bytes 63418368 1699198374885
# HELP systemruntime_gc_heap_size_bytes GC Heap Size
# TYPE systemruntime_gc_heap_size_bytes gauge
systemruntime_gc_heap_size_bytes 7085504 1699198364894
systemruntime_gc_heap_size_bytes 7093696 1699198369888
systemruntime_gc_heap_size_bytes 7110080 1699198374885
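Each metric appears several times in the snapshot, once per sample, followed by a millisecond timestamp. When only the most recent reading matters, the text can be reduced with standard tools. The sketch below assumes samples for a metric are listed oldest first, as in the output above. The helper latest_samples is hypothetical and keeps the last line seen for each metric name.

```shell
# latest_samples: reads Prometheus exposition text on stdin and prints
# "<metric> <value>" for the last sample seen of each metric, in first-seen order.
latest_samples() {
    awk '
        /^#/ || NF < 2 { next }    # skip HELP and TYPE comments and blank lines
        { if (!($1 in last)) order[++n] = $1; last[$1] = $2 }
        END { for (i = 1; i <= n; i++) print order[i], last[order[i]] }
    '
}

# Run against an excerpt of the output shown above.
latest_samples <<'EOF'
# HELP systemruntime_working_set_bytes Working Set
# TYPE systemruntime_working_set_bytes gauge
systemruntime_working_set_bytes 63393792 1699198364894
systemruntime_working_set_bytes 63401984 1699198369888
systemruntime_working_set_bytes 63418368 1699198374885
# HELP systemruntime_gc_heap_size_bytes GC Heap Size
# TYPE systemruntime_gc_heap_size_bytes gauge
systemruntime_gc_heap_size_bytes 7085504 1699198364894
systemruntime_gc_heap_size_bytes 7110080 1699198374885
EOF
```

Against a running instance, the same function could be fed from `curl -k -s https://localhost:52323/metrics`.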
dotnet-monitor supports both the System.Diagnostics.Metrics based APIs (for .NET 8 applications) and EventCounters. Since we are using .NET 7, we will use an EventCounter and modify the Program.cs file as follows:
using System.Diagnostics.Tracing;
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () =>
{
app.Logger.LogInformation("Hello World!");
MyEventSource.Log.Request("Hello World!");
return "Hello World!";
});
app.Run();
[EventSource(Name = "MyEventSource")]
public sealed class MyEventSource : EventSource
{
public static MyEventSource Log { get; } = new MyEventSource();
private EventCounter _counter;
public MyEventSource()
{
_counter = new EventCounter("my-custom-counter", this)
{
DisplayName = "my-custom-counter",
DisplayUnits = "ms"
};
}
[Event(1, Level = EventLevel.Informational)]
public void Request(string message)
{
WriteEvent(1, message);
_counter.WriteMetric(1);
}
}
To capture metrics from a custom event provider, we need to modify the settings.json file as follows:
{
"Metrics": {
"Providers": [
{
"ProviderName": "MyEventSource",
"CounterNames": [
"my-custom-counter"
]
}
]
},
"DefaultProcess": {
"Filters": [{
"Key": "ProcessId",
"Value": "<pid>"
}]
}
}
Live Metrics
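The Live Metrics API captures metrics for the selected process (from the same default providers listed in the Metrics section). Navigate to https://localhost:52323/livemetrics?pid={pid}&durationSeconds=60 and wait for the .json file. To capture live metrics from a custom event provider, we need to make a POST call to the same endpoint with the following request body: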
{
"includeDefaultProviders": false,
"providers": [
{
"providerName": "MyEventSource",
"counterNames": [
"my-custom-counter"
]
}
]
}
The output will be similar to the following:
{"timestamp":"2023-11-05T18:52:32.5333078-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:37.5321623-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:42.5360839-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:47.5309596-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:52.5323712-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:57.5310386-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
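Each record in this output is one JSON object, so a quick look at the timestamp, counter name, and value of every sample does not need a full JSON parser. The sketch below is a rough one: the helper counter_samples is hypothetical, and the sed pattern relies on the flat, one-record-per-line layout with the field order shown above. A JSON-aware tool would be more robust for anything beyond a quick check.

```shell
# counter_samples: reads live-metrics records on stdin (one JSON object per line)
# and prints "<timestamp> <name> <value>" for each record.
counter_samples() {
    sed -n 's/.*"timestamp":"\([^"]*\)".*"name":"\([^"]*\)".*"value":\([^,}]*\).*/\1 \2 \3/p'
}

# Run against two records taken from the output shown above.
counter_samples <<'EOF'
{"timestamp":"2023-11-05T18:52:32.5333078-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
{"timestamp":"2023-11-05T18:52:37.5321623-05:00","provider":"MyEventSource","name":"my-custom-counter","displayName":"my-custom-counter","unit":"ms","counterType":"Metric","tags":"","value":1}
EOF
```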
Dump
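The Dump API captures a managed dump of the specified process without using a debugger. Navigate to https://localhost:52323/dump?pid={pid} and wait for the .dmp file (the application is suspended while the dump is being collected). The dump file can be analyzed with tools such as dotnet-dump or Visual Studio. A dump file cannot be analyzed on a machine whose operating system or architecture differs from the one where it was captured.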
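Since the application is suspended during collection, it can be worth asking for a smaller dump when a full one is not needed. The sketch below assumes the endpoint accepts a type query parameter with values such as Mini, WithHeap, and Full, which the article does not cover. The helper dump_url and the file name app.dmp are hypothetical.

```shell
# Base address used throughout the article; override MONITOR_URL if yours differs.
MONITOR_URL="${MONITOR_URL:-https://localhost:52323}"

# dump_url PID TYPE: prints the /dump URL for the given process and dump type.
# The type parameter is an assumption about the endpoint, not taken from the article.
dump_url() {
    printf '%s/dump?pid=%s&type=%s\n' "$MONITOR_URL" "$1" "$2"
}

# Print the URL for a minimal dump of the sample process.
dump_url 57388 Mini

# To download the dump from a running instance (-k skips validation of the
# local development certificate):
#   curl -k -o app.dmp "$(dump_url 57388 Mini)"
```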
GCDump
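The GCDump API captures a GC dump of the specified process. Navigate to https://localhost:52323/gcdump?pid={pid} and wait for the .gcdump file. Besides Visual Studio, we can analyze gcdump files with PerfView and generate reports with dotnet-gcdump. Unlike dump files, gcdump files use a portable format that can be analyzed regardless of the platform on which they were collected.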