Prometheus & Grafana
====================

In production we use a few tools for monitoring and alerting:

1. Sentry: for crash monitoring on the Python and Django side
2. Crashlytics: for crash monitoring on the mobile side

Historically we've used Nagios, but we now prefer Prometheus and Grafana.

Prometheus is an open-source project written in Go. It records real-time metrics in a time series database, collects them over an HTTP pull model, and supports flexible queries and real-time alerting.

Key concepts to understand are:

1. Prometheus is, at its core, a time series database that stores metrics.
2. Node Exporter collects metrics from a node and exposes them over HTTP for Prometheus to scrape.

Installing Prometheus
`````````````````````

1. Create users for prometheus and node_exporter

   .. code-block:: sh

      sudo useradd --no-create-home --shell /bin/false prometheus
      sudo useradd --no-create-home --shell /bin/false node_exporter

2. Create the prometheus directories

   .. code-block:: sh

      sudo mkdir /etc/prometheus
      sudo mkdir /var/lib/prometheus

3. Change the owner of the directories created

   .. code-block:: sh

      sudo chown prometheus:prometheus /etc/prometheus
      sudo chown prometheus:prometheus /var/lib/prometheus

4. Download the prometheus binary for your operating system (v2.9.2 is used here).

   .. code-block:: sh

      wget https://github.com/prometheus/prometheus/releases/download/v2.9.2/prometheus-2.9.2.linux-amd64.tar.gz

5. Extract the compressed file

   .. code-block:: sh

      tar -zxvf prometheus-2.9.2.linux-amd64.tar.gz

6. Copy the prometheus binaries to /usr/local/bin

   .. code-block:: sh

      sudo cp prometheus-2.9.2.linux-amd64/prometheus /usr/local/bin/
      sudo cp prometheus-2.9.2.linux-amd64/promtool /usr/local/bin/

7. Change the owner of the binaries to prometheus

   .. code-block:: sh

      sudo chown prometheus:prometheus /usr/local/bin/prometheus
      sudo chown prometheus:prometheus /usr/local/bin/promtool

8. Copy the console templates and libraries to /etc/prometheus

   .. code-block:: sh

      sudo cp -r prometheus-2.9.2.linux-amd64/consoles /etc/prometheus
      sudo cp -r prometheus-2.9.2.linux-amd64/console_libraries /etc/prometheus

9. Change the owner of the copied directories to prometheus

   .. code-block:: sh

      sudo chown -R prometheus:prometheus /etc/prometheus/consoles
      sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries

10. Create the prometheus config file and add the following config

    .. code-block:: sh

       sudo nano /etc/prometheus/prometheus.yml

    Update the file to contain

    .. code-block:: yaml

       global:
         scrape_interval: 15s

       scrape_configs:
         - job_name: 'prometheus'
           scrape_interval: 5s
           static_configs:
             - targets: ['localhost:9090']

11. Change the owner of the config file

    .. code-block:: sh

       sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml

12. Register prometheus as a systemd service. Create the service file and add the following lines

    .. code-block:: sh

       sudo nano /etc/systemd/system/prometheus.service

    Update the file to contain

    .. code-block:: ini

       [Unit]
       Description=Prometheus
       Wants=network-online.target
       After=network-online.target

       [Service]
       User=prometheus
       Group=prometheus
       Type=simple
       ExecStart=/usr/local/bin/prometheus \
           --config.file /etc/prometheus/prometheus.yml \
           --storage.tsdb.path /var/lib/prometheus/ \
           --web.console.templates=/etc/prometheus/consoles \
           --web.console.libraries=/etc/prometheus/console_libraries

       [Install]
       WantedBy=multi-user.target

13. Reload the daemon and start the prometheus service

    .. code-block:: sh

       sudo systemctl daemon-reload
       sudo systemctl start prometheus
       sudo systemctl status prometheus

Setting up Node Exporter
````````````````````````

To recap, Node Exporter is a Prometheus exporter for hardware and OS metrics with pluggable metric collectors. It measures machine resources such as memory, disk and CPU utilization.

1. Download the prometheus node exporter binary

   .. code-block:: sh

      wget https://github.com/prometheus/node_exporter/releases/download/v0.17.0/node_exporter-0.17.0.linux-amd64.tar.gz

2. Extract the compressed file

   .. code-block:: sh

      tar -zxvf node_exporter-0.17.0.linux-amd64.tar.gz

3. Copy the binary to `/usr/local/bin` and change its owner

   .. code-block:: sh

      sudo cp node_exporter-0.17.0.linux-amd64/node_exporter /usr/local/bin
      sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

4. Register node exporter as a `systemd` service. Create the service file and add the following lines

   .. code-block:: sh

      sudo nano /etc/systemd/system/node_exporter.service

   Update the file to contain

   .. code-block:: ini

      [Unit]
      Description=Node Exporter
      Wants=network-online.target
      After=network-online.target

      [Service]
      User=node_exporter
      Group=node_exporter
      Type=simple
      ExecStart=/usr/local/bin/node_exporter

      [Install]
      WantedBy=multi-user.target

5. Reload the daemon and start the node exporter service

   .. code-block:: sh

      sudo systemctl daemon-reload
      sudo systemctl start node_exporter
      sudo systemctl status node_exporter

6. Add the node exporter config to prometheus

   .. code-block:: sh

      sudo nano /etc/prometheus/prometheus.yml

   Append the following lines to the `scrape_configs` section of `prometheus.yml`

   .. code-block:: yaml

        - job_name: 'node_exporter'
          scrape_interval: 5s
          static_configs:
            - targets: ['localhost:9100']

7. Restart the prometheus service

   .. code-block:: sh

      sudo systemctl restart prometheus
      sudo systemctl status prometheus

Python Integration
``````````````````

Integrating prometheus with a Python app is straightforward. First, install the prometheus Python client library using the following command.

.. code-block:: sh

    pip install prometheus_client

Consider the following Python app. Create the file `awesome_python_app.py` and add the following content.

.. code-block:: python

    from prometheus_client import start_http_server, Counter, Gauge
    import random
    import time

    # Initialize required metrics
    MY_COUNTER = Counter("total_number_of_requests", "Total number of requests processed")
    MY_GAUGE = Gauge("request_processing_seconds", "Time spent processing request")


    # Function which processes your request
    def process_request(t):
        """A dummy function that takes some time."""
        # Increment the counter each time the function gets called
        MY_COUNTER.inc()
        # Set the time taken to process the request
        MY_GAUGE.set(t)
        # Sleep for some time
        time.sleep(t)


    if __name__ == '__main__':
        # The prometheus client library provides start_http_server,
        # which serves the metrics from a separate thread.
        start_http_server(8000)
        # Generate some requests.
        while True:
            process_request(random.random())

Run the file with `python awesome_python_app.py`. Open a browser and go to `http://localhost:8000`; you will see the prometheus metrics updated for each request.

Golang Integration
``````````````````

`gRPC` for Go has support for interceptors (i.e. middleware that is executed before the request is passed on to the user's application logic). This is where we can integrate prometheus monitoring. The following example walks through the integration step by step.

Basic directory structure of the gRPC app.

.. code-block:: sh

    client/
        client.go
    protobuf/
        service.pb.go
        service.proto
    server/
        server.go

The following is the initial content of `server.go`

.. code-block:: go

    package main

    import (
        "context"
        "fmt"
        "log"
        "net"

        "google.golang.org/grpc"

        pb "github.com/grpc-ecosystem/go-grpc-prometheus/examples/grpc-server-with-prometheus/protobuf"
    )

    // DemoServiceServer defines a Server.
    type DemoServiceServer struct{}

    func newDemoServer() *DemoServiceServer {
        return &DemoServiceServer{}
    }

    // SayHello implements an interface defined by protobuf.
    func (s *DemoServiceServer) SayHello(ctx context.Context, request *pb.HelloRequest) (*pb.HelloResponse, error) {
        return &pb.HelloResponse{Message: fmt.Sprintf("Hello %s", request.Name)}, nil
    }

    // NOTE: Graceful shutdown is missing. Don't use this demo in your production setup.
    func main() {
        // Listen on an actual port.
        lis, err := net.Listen("tcp", fmt.Sprintf(":%d", 9093))
        if err != nil {
            log.Fatalf("failed to listen: %v", err)
        }
        defer lis.Close()

        // Create a gRPC server.
        grpcServer := grpc.NewServer()

        // Create a new api server.
        demoServer := newDemoServer()

        // Register your service.
        pb.RegisterDemoServiceServer(grpcServer, demoServer)

        // Start your gRPC server.
        log.Fatal(grpcServer.Serve(lis))
    }

The following is the content of the protobuf file `service.proto`

.. code-block:: protobuf

    syntax = "proto3";

    package proto;

    service DemoService {
        rpc SayHello(HelloRequest) returns (HelloResponse) {}
    }

    message HelloRequest {
        string name = 1;
    }

    message HelloResponse {
        string message = 1;
    }

`service.pb.go` is the compiled protobuf file.

The following is the content of `client.go`

.. code-block:: go

    package main

    import (
        "bufio"
        "context"
        "fmt"
        "log"
        "os"
        "strings"
        "time"

        "google.golang.org/grpc"

        pb "github.com/grpc-ecosystem/go-grpc-prometheus/examples/grpc-server-with-prometheus/protobuf"
    )

    func main() {
        // Create an insecure gRPC channel to communicate with the server.
        conn, err := grpc.Dial(
            fmt.Sprintf("localhost:%v", 9093),
            grpc.WithInsecure(),
        )
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Create a gRPC server client.
        client := pb.NewDemoServiceClient(conn)
        fmt.Println("Start to call the method called SayHello every 3 seconds")
        go func() {
            for {
                // Call the "SayHello" method and wait for the response from the gRPC server.
                _, err := client.SayHello(context.Background(), &pb.HelloRequest{Name: "Test"})
                if err != nil {
                    log.Printf("Calling the SayHello method unsuccessfully. ErrorInfo: %+v", err)
                    log.Printf("You should stop the process")
                    return
                }
                time.Sleep(3 * time.Second)
            }
        }()

        scanner := bufio.NewScanner(os.Stdin)
        fmt.Println("You can press n or N to stop the process of client")
        for scanner.Scan() {
            if strings.ToLower(scanner.Text()) == "n" {
                os.Exit(0)
            }
        }
    }

As shown in the files above, we have a basic gRPC app with a service method called SayHello. Let's integrate prometheus on the server side and the client side.

First, add the prometheus dependency. Run the following command in your terminal to install the `go-grpc-prometheus` package

.. code-block:: sh

    go get "github.com/grpc-ecosystem/go-grpc-prometheus"

On the server side, import the prometheus packages and create the gRPC server with the prometheus interceptors. Create a customized counter metric, register it, and increment it on every service call. The metrics are served over HTTP on port 9092, separate from the gRPC port 9093.

This is what `server.go` looks like after adding the prometheus metrics.

.. code-block:: go

    package main

    import (
        "context"
        "fmt"
        "log"
        "net"
        "net/http"

        "google.golang.org/grpc"

        "github.com/grpc-ecosystem/go-grpc-prometheus"
        pb "github.com/grpc-ecosystem/go-grpc-prometheus/examples/grpc-server-with-prometheus/protobuf"
        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // DemoServiceServer defines a Server.
    type DemoServiceServer struct{}

    func newDemoServer() *DemoServiceServer {
        return &DemoServiceServer{}
    }

    // SayHello implements an interface defined by protobuf.
    func (s *DemoServiceServer) SayHello(ctx context.Context, request *pb.HelloRequest) (*pb.HelloResponse, error) {
        customizedCounterMetric.WithLabelValues(request.Name).Inc()
        return &pb.HelloResponse{Message: fmt.Sprintf("Hello %s", request.Name)}, nil
    }

    var (
        // Create a metrics registry.
        reg = prometheus.NewRegistry()

        // Create some standard server metrics.
        grpcMetrics = grpc_prometheus.NewServerMetrics()

        // Create a customized counter metric.
        customizedCounterMetric = prometheus.NewCounterVec(prometheus.CounterOpts{
            Name: "demo_server_say_hello_method_handle_count",
            Help: "Total number of RPCs handled on the server.",
        }, []string{"name"})
    )

    func init() {
        // Register standard server metrics and customized metrics to registry.
        reg.MustRegister(grpcMetrics, customizedCounterMetric)
        customizedCounterMetric.WithLabelValues("Test")
    }

    // NOTE: Graceful shutdown is missing. Don't use this demo in your production setup.
    func main() {
        // Listen on an actual port.
        lis, err := net.Listen("tcp", fmt.Sprintf(":%d", 9093))
        if err != nil {
            log.Fatalf("failed to listen: %v", err)
        }
        defer lis.Close()

        // Create a HTTP server for prometheus.
        httpServer := &http.Server{Handler: promhttp.HandlerFor(reg, promhttp.HandlerOpts{}), Addr: fmt.Sprintf("0.0.0.0:%d", 9092)}

        // Create a gRPC server with the prometheus interceptors.
        grpcServer := grpc.NewServer(
            grpc.StreamInterceptor(grpcMetrics.StreamServerInterceptor()),
            grpc.UnaryInterceptor(grpcMetrics.UnaryServerInterceptor()),
        )

        // Create a new api server.
        demoServer := newDemoServer()

        // Register your service.
        pb.RegisterDemoServiceServer(grpcServer, demoServer)

        // Initialize all metrics.
        grpcMetrics.InitializeMetrics(grpcServer)

        // Start your http server for prometheus.
        go func() {
            if err := httpServer.ListenAndServe(); err != nil {
                log.Fatal("Unable to start a http server.")
            }
        }()

        // Start your gRPC server.
        log.Fatal(grpcServer.Serve(lis))
    }

Let's integrate the same prometheus metrics on the client side.
We create the standard client metrics, register them, and attach the client interceptors so that every call to the server is counted. The client serves its metrics over HTTP on port 9094.

This is what `client.go` looks like after adding the prometheus metrics.

.. code-block:: go

    package main

    import (
        "bufio"
        "context"
        "fmt"
        "log"
        "net/http"
        "os"
        "strings"
        "time"

        "google.golang.org/grpc"

        "github.com/grpc-ecosystem/go-grpc-prometheus"
        pb "github.com/grpc-ecosystem/go-grpc-prometheus/examples/grpc-server-with-prometheus/protobuf"
        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    func main() {
        // Create a metrics registry.
        reg := prometheus.NewRegistry()

        // Create some standard client metrics.
        grpcMetrics := grpc_prometheus.NewClientMetrics()

        // Register client metrics to registry.
        reg.MustRegister(grpcMetrics)

        // Create an insecure gRPC channel to communicate with the server.
        conn, err := grpc.Dial(
            fmt.Sprintf("localhost:%v", 9093),
            grpc.WithUnaryInterceptor(grpcMetrics.UnaryClientInterceptor()),
            grpc.WithStreamInterceptor(grpcMetrics.StreamClientInterceptor()),
            grpc.WithInsecure(),
        )
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Create a HTTP server for prometheus.
        httpServer := &http.Server{Handler: promhttp.HandlerFor(reg, promhttp.HandlerOpts{}), Addr: fmt.Sprintf("0.0.0.0:%d", 9094)}

        // Start your http server for prometheus.
        go func() {
            if err := httpServer.ListenAndServe(); err != nil {
                log.Fatal("Unable to start a http server.")
            }
        }()

        // Create a gRPC server client.
        client := pb.NewDemoServiceClient(conn)
        fmt.Println("Start to call the method called SayHello every 3 seconds")
        go func() {
            for {
                // Call the "SayHello" method and wait for the response from the gRPC server.
                _, err := client.SayHello(context.Background(), &pb.HelloRequest{Name: "Test"})
                if err != nil {
                    log.Printf("Calling the SayHello method unsuccessfully. ErrorInfo: %+v", err)
                    log.Printf("You should stop the process")
                    return
                }
                time.Sleep(3 * time.Second)
            }
        }()

        scanner := bufio.NewScanner(os.Stdin)
        fmt.Println("You can press n or N to stop the process of client")
        for scanner.Scan() {
            if strings.ToLower(scanner.Text()) == "n" {
                os.Exit(0)
            }
        }
    }

Run both `server.go` and `client.go`, then open a browser and go to the following URLs to see the prometheus metrics for the server and the client. Note that the server's metrics are served on the HTTP port 9092, not on the gRPC port 9093.

1. Server Metrics URL : `http://localhost:9092/metrics`
2. Client Metrics URL : `http://localhost:9094/metrics`
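
The metrics endpoints return plain text in the Prometheus exposition format, so they can also be checked from a script instead of a browser. The following is a minimal sketch using only the Python standard library. The helper name `parse_metrics` and the `SAMPLE` text are hypothetical and for illustration only (the sample is not captured from a real run), and the parser assumes well-formed lines without trailing timestamps.

.. code-block:: python

    def parse_metrics(text):
        """Parse Prometheus text exposition output into a {series: value} dict."""
        samples = {}
        for line in text.splitlines():
            line = line.strip()
            # Skip blank lines and "# HELP" / "# TYPE" comment lines.
            if not line or line.startswith("#"):
                continue
            # Assumption: no trailing timestamp, so the last token is the value.
            series, value = line.rsplit(None, 1)
            samples[series] = float(value)
        return samples


    # Illustrative sample only; real output will contain many more series.
    SAMPLE = """\
    # HELP demo_server_say_hello_method_handle_count Total number of RPCs handled on the server.
    # TYPE demo_server_say_hello_method_handle_count counter
    demo_server_say_hello_method_handle_count{name="Test"} 3
    """

    if __name__ == "__main__":
        for series, value in parse_metrics(SAMPLE).items():
            print(series, value)

To run it against a live endpoint, the text could be fetched with `urllib.request.urlopen("http://localhost:9094/metrics").read().decode()` and passed to `parse_metrics` in place of `SAMPLE`.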