Distributed tracing · Development · Help · GitLab (2024)

GitLab is instrumented for distributed tracing. Distributed tracing in GitLab is currently considered experimental, as it has not yet been tested at scale on GitLab.com.

According to OpenTracing:

Distributed tracing, also called distributed request tracing, is a method used to profile and monitor applications, especially those built using a microservices architecture. Distributed tracing helps to pinpoint where failures occur and what causes poor performance.

Distributed tracing is especially helpful in understanding the life cycle of a request as it passes through the different components of the GitLab application. At present, Workhorse, Rails, Sidekiq, and Gitaly support tracing instrumentation.

Distributed tracing adds minimal overhead when disabled and only a small overhead when enabled, so it can be used in any environment, including production. For this reason, it can be useful in diagnosing production issues, particularly performance problems.

Services have different levels of support for distributed tracing. Custom instrumentation code must be added to the application layer in addition to pre-built instrumentation for the most common libraries.

For service-specific information, see:

Using Correlation IDs to investigate distributed requests

The GitLab application passes correlation IDs between the various components in a request. A correlation ID is a token, unique to a single request, used to correlate a single request between different GitLab subsystems (for example, Rails, Workhorse). Because correlation IDs are included in log output, engineers can use the correlation ID to correlate logs from different subsystems and better understand the end-to-end path of a request through the system. When a request traverses process boundaries, the correlation ID is injected into the outgoing request. This enables the propagation of the correlation ID to each downstream subsystem.

Correlation IDs are usually generated in the Rails application in response to certain web requests. Some user-facing systems don't generate correlation IDs in response to user requests (for example, Git pushes over SSH).
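The propagation described above can be sketched in a few lines of Python. This is only an illustration, not GitLab's implementation: the header name X-Request-Id and all function names are assumptions made for the example, and the dictionary lookup is case-sensitive, which real HTTP header handling is not.

```python
import uuid
from typing import Dict, Optional

# Assumed header name for this sketch; check the subsystem you integrate with.
CORRELATION_HEADER = "X-Request-Id"


def incoming_correlation_id(headers: Dict[str, str]) -> Optional[str]:
    """Return the upstream correlation ID, or None when the caller sent none."""
    value = headers.get(CORRELATION_HEADER)
    return value if value else None


def new_correlation_id() -> str:
    """Generate a fresh, opaque ID at the edge of the system."""
    return uuid.uuid4().hex


def outgoing_headers(base: Dict[str, str], correlation_id: Optional[str]) -> Dict[str, str]:
    """Copy `base` and inject the correlation ID for the downstream request."""
    headers = dict(base)
    if correlation_id is not None:
        headers[CORRELATION_HEADER] = correlation_id
    return headers
```

A component at the edge would call new_correlation_id() only when incoming_correlation_id() returns None, then pass the same value to outgoing_headers() for every downstream call made while serving that request.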

Developer guidelines for working with correlation IDs

When integrating tracing into a new system, developers should avoid making certain assumptions about correlation IDs. The following guidelines apply to all subsystems at GitLab:

  • Correlation IDs are always optional.
    • Never have non-tracing features depend on the existence of a correlation ID from an upstream system.
  • Correlation IDs are always free text.
    • Correlation IDs should never be used to pass context (for example, a username or an IP address).
    • Correlation IDs should never be parsed, or manipulated in other ways (for example, split).
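A short Python sketch of what these guidelines mean in practice. The handler, the log field name correlation_id, and the function names are hypothetical and chosen only for illustration. The ID is optional, copied verbatim into the log record, and never influences the result.

```python
import json
from typing import Optional


def log_line(message: str, correlation_id: Optional[str] = None) -> str:
    """Build a structured log line; the correlation ID is attached only if present."""
    record = {"message": message}
    if correlation_id:
        # Treated as opaque free text: copied verbatim, never split or parsed.
        record["correlation_id"] = correlation_id
    return json.dumps(record, sort_keys=True)


def handle_request(payload: str, correlation_id: Optional[str] = None) -> str:
    """Hypothetical handler: its result never depends on the correlation ID."""
    result = payload.upper()
    log_line("handled request", correlation_id)
    return result
```

Calling handle_request("ok") and handle_request("ok", "any free text") returns the same value. Only the log output differs.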

The LabKit library provides a standardized interface for working with GitLab correlation IDs in the Go programming language. LabKit can be used as a reference implementation for developers working with tracing and correlation IDs on non-Go GitLab subsystems.

Enabling distributed tracing

GitLab uses the GITLAB_TRACING environment variable to configure distributed tracing. The same configuration is used for all components (for example, Workhorse, Rails, and so on).

When GITLAB_TRACING is not set, the application isn't instrumented, meaning that there is no overhead at all.

To enable GITLAB_TRACING, a valid "configuration-string" value should be set, with a URL-like form:

GITLAB_TRACING=opentracing://<driver>?<param_name>=<param_value>&<param_name_2>=<param_value_2>

In this example, we have the following hypothetical values:

  • driver: the driver, such as Jaeger.
  • param_name, param_value: these are driver-specific configuration values. Configuration parameters for Jaeger are documented further on in this document. They should be URL encoded. Multiple values should be separated by & characters, as in a URL.
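As a sketch of the URL-like form described above, the following Python helper assembles a configuration string from a driver name and a set of parameters. The function name build_tracing_config is hypothetical. The parameter names used in the usage note are the Jaeger options documented further on in this document.

```python
from typing import Mapping
from urllib.parse import urlencode


def build_tracing_config(driver: str, params: Mapping[str, str]) -> str:
    """Assemble opentracing://<driver>?<name>=<value>&... with URL-encoded values."""
    query = urlencode(params)  # percent-encodes values and joins pairs with '&'
    return f"opentracing://{driver}?{query}" if query else f"opentracing://{driver}"
```

For example, build_tracing_config("jaeger", {"http_endpoint": "http://localhost:14268/api/traces", "sampler": "const", "sampler_param": "1"}) produces the same-host value used in the Jaeger configuration step later in this document.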

GitLab Rails provides pre-implemented instrumentations for common types of operations that offer a detailed view of the requests. However, the detailed information comes at a cost. The resulting traces are long and can be difficult to process, making it hard to identify bigger underlying issues. To address this concern, some instrumentations are disabled by default. To enable those disabled instrumentations, set the following environment variables:

  • GITLAB_TRACING_TRACK_CACHES: enable tracking cache operations, such as cache read, write, or delete.
  • GITLAB_TRACING_TRACK_REDIS: enable tracking Redis operations. Most Redis operations are for caching, though.

Using Jaeger in the GitLab Development Kit

The first tracing implementation that GitLab supports is Jaeger, and the GitLab Development Kit supports distributed tracing with Jaeger out-of-the-box. GDK automatically adds GITLAB_TRACING environment variables to all services.

Configure GDK for Jaeger by editing the gdk.yml file and adding the following settings:

tracer:
  build_tags: tracer_static tracer_static_jaeger
  jaeger:
    enabled: true
    listen_address: 127.0.0.1
    version: 1.43.0

After modifying the gdk.yml file, reconfigure your GDK by running the gdk reconfigure command. This ensures that your GDK is properly configured and ready to use.

The above configuration sets the tracer_static and tracer_static_jaeger build tags when rebuilding services written in Go for the first time. Any changes made afterward require rebuilding them with those build tags. You can either:

  • Add those build tags to the default set of build tags.
  • Manually attach them to the build command. For example, Gitaly supports adding build tags out of the box. You can run make all WITH_BUNDLED_GIT=YesPlease BUILD_TAGS="tracer_static tracer_static_jaeger".

After reconfiguration, the Jaeger dashboard is available at http://localhost:16686. Another way to access tracing from a GDK environment is through the performance bar. This can be shown by typing p b in the browser window.

Once the performance bar is enabled, select Trace in the performance bar to go to the Jaeger UI.

The Jaeger search UI opens with a query for the correlation ID of the current request. This search should return a single trace result. Selecting this result shows the details of the trace in a hierarchical timeline.

Using Jaeger without the GitLab Development Kit

Distributed tracing can be enabled in non-GDK development environments, as well as in production or staging environments, for troubleshooting. This functionality is currently experimental and not supported in production environments. In this first release, it is intended for debugging in development environments only.

Jaeger tracing can be enabled through a four-step process:

  1. Start Jaeger.
  2. Configure the GITLAB_TRACING environment variable.
  3. Start the GitLab application.
  4. Go to the Jaeger Search UI in your browser.

1. Start Jaeger

Jaeger has many configuration options, but is very easy to start in an "all-in-one" mode, which uses memory for trace storage (and is therefore non-persistent). The main advantage of "all-in-one" mode is ease of use.

For more detailed configuration options, refer to the Jaeger documentation.

Using Docker

If you have Docker available, the easiest approach to running the Jaeger all-in-one is through Docker, using the following command:

$ docker run \
    --rm \
    -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
    -p 5775:5775/udp \
    -p 6831:6831/udp \
    -p 6832:6832/udp \
    -p 5778:5778 \
    -p 16686:16686 \
    -p 14268:14268 \
    -p 9411:9411 \
    jaegertracing/all-in-one:latest

Using the Jaeger process

Without Docker, the all-in-one process is still easy to set up.

  1. Download the latest Jaeger release for your platform.
  2. Extract the archive and run the bin/all-in-one process.

This should start the process with the default listening ports.

2. Configure the GITLAB_TRACING environment variable

Once you have Jaeger running, configure the GITLAB_TRACING variable with the appropriate configuration string.

If you're running everything on the same host, use the following value:

export GITLAB_TRACING="opentracing://jaeger?http_endpoint=http%3A%2F%2Flocalhost%3A14268%2Fapi%2Ftraces&sampler=const&sampler_param=1"

This configuration string uses the Jaeger driver opentracing://jaeger with the following options:

| Name | Value | Description |
|------|-------|-------------|
| http_endpoint | http://localhost:14268/api/traces | Configures Jaeger to send trace information to the HTTP endpoint running on http://localhost:14268/. Alternatively, the udp_endpoint can be used. |
| sampler | const | Configures Jaeger to use the constant sampler (either on or off). |
| sampler_param | 1 | Configures the const sampler to sample all traces. Using 0 would sample no traces. |
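To see how the options above map onto the configuration string, the following Python sketch splits a GITLAB_TRACING value back into its driver and decoded options. The function name parse_tracing_config is hypothetical, and this is not how GitLab components read the variable. It only demonstrates the URL-like structure.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlsplit


def parse_tracing_config(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a GITLAB_TRACING value into (driver, decoded options)."""
    parts = urlsplit(value)
    if parts.scheme != "opentracing" or not parts.netloc:
        raise ValueError(f"not an opentracing:// configuration string: {value!r}")
    # parse_qsl splits on '&' and percent-decodes each value.
    return parts.netloc, dict(parse_qsl(parts.query))
```

Applied to the same-host value above, the driver is jaeger and the http_endpoint option decodes to http://localhost:14268/api/traces.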

Other parameter values are also possible:

| Name | Example | Description |
|------|---------|-------------|
| udp_endpoint | localhost:6831 | This is the default. Configures Jaeger to send trace information to the UDP listener on port 6831 using the compact Thrift protocol. Note that we've experienced some issues with the Jaeger Client for Ruby when using this protocol. |
| sampler | probabilistic | Configures Jaeger to use a probabilistic random sampler. The rate of samples is configured by the sampler_param value. |
| sampler_param | 0.01 | Use a ratio of 0.01 to configure the probabilistic sampler to randomly sample 1% of traces. |
| service_name | api | Override the service name used by the Jaeger backend. This parameter takes precedence over the application-supplied value. |
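The difference between the const and probabilistic samplers can be illustrated with a small Python sketch. This is a simplified model of the behavior described in the tables, not the Jaeger client's actual sampling code, and the function name should_sample is hypothetical.

```python
import random
from typing import Optional


def should_sample(sampler: str, sampler_param: float, rng: Optional[random.Random] = None) -> bool:
    """Decide whether to record a trace under the named sampling strategy."""
    if sampler == "const":
        # 1 samples every trace, 0 samples none.
        return sampler_param == 1
    if sampler == "probabilistic":
        # Sample roughly `sampler_param` of all traces (0.01 is about 1%).
        source = rng if rng is not None else random
        return source.random() < sampler_param
    raise ValueError(f"unknown sampler: {sampler!r}")
```

With sampler=const and sampler_param=1, every request yields a trace, which is convenient in development. A probabilistic sampler with a small ratio keeps trace volume low on busier systems.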

NOTE: The same GITLAB_TRACING value should be configured in the environment variables for all GitLab processes, including Workhorse, Gitaly, Rails, and Sidekiq.

3. Start the GitLab application

After the GITLAB_TRACING environment variable is exported to all GitLab services, start the application.

When GITLAB_TRACING is configured properly, the application logs this on startup:

13:41:53 gitlab-workhorse.1 | 2019/02/12 13:41:53 Tracing enabled
...
13:41:54 gitaly.1 | 2019/02/12 13:41:54 Tracing enabled
...

If GITLAB_TRACING is not configured correctly, this issue is logged:

13:43:45 gitaly.1 | 2019/02/12 13:43:45 skipping tracing configuration step: tracer: unable to load driver mytracer

By default, GitLab ships with the Jaeger tracer, but other tracers can be included at compile time. Details of how this can be done are included in the LabKit tracing documentation.

If no log messages about tracing are emitted, the GITLAB_TRACING environment variable is likely not set.
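The three outcomes described above (enabled, misconfigured, or not set) can be checked mechanically. The following Python sketch classifies captured startup output using the two message fragments shown in the sample logs. The function name and the returned labels are hypothetical, and the exact log wording may differ between versions.

```python
from typing import Iterable

# Message fragments taken from the sample startup logs above.
ENABLED_MARKER = "Tracing enabled"
SKIPPED_MARKER = "skipping tracing configuration step"


def tracing_status(log_lines: Iterable[str]) -> str:
    """Classify startup output as 'enabled', 'misconfigured', or 'not set'."""
    status = "not set"
    for line in log_lines:
        if SKIPPED_MARKER in line:
            # Any component that skipped configuration counts as a misconfiguration.
            return "misconfigured"
        if ENABLED_MARKER in line:
            status = "enabled"
    return status
```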

4. Open the Jaeger Search UI

By default, the Jaeger search UI is available at http://localhost:16686/search.

NOTE: Don't forget that you must generate traces by using the application before they appear in the Jaeger UI.
