Wednesday, May 25, 2022

Grafana Alert Rules, Contact Points and Notification Policies with Azure

Recently Microsoft announced that they will have a Grafana service available to use on Azure, awesome stuff.

I like Grafana for dashboards. It still has a way to go, however, especially when it comes to alerts and doing things at scale.

You can run Grafana locally (I’m running it on Windows), in a container, or on Azure. No matter where you run it, there are a few things I want to cover to help people who are considering using it.

I am currently using version 8.4.5, and I wanted to create some Azure alerts to see what Grafana has to offer in the way of alerting. It has some nice stuff, to be fair. How it goes about it needs some work, but I reckon it will definitely get there in upcoming versions.

Creating dashboards is currently very simple. When it comes to Azure you need to:-

  • Create a data source (Azure Monitor)
  • Add a panel to a new dashboard
  • Select the data source and then choose either Metrics, Logs or Azure Resource Graph.
  • Fill out the details

Simple stuff. Now what if you want to create an alert? The dashboard you create is stored as JSON, which contains all of the panels, the settings, etc. Alerts are stored separately, and to be honest I think alerts are still being worked on.

Anyway, since alerts are stored elsewhere, the good news is there is an API for Grafana. The bad news is it’s not the best, either that or the documentation is wrong. If you try it out and it all just works, please do give me a shout.

An alert rule is a set of settings that defines when an alert should fire. Let’s say your virtual machine’s CPU goes above 75% for between 1 and 5 minutes: then raise an alert. Alerting in Grafana also relies on what are called contact points and notification policies, and this idea I do like.
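
As a rough illustration of the "above a threshold for a period of time" idea, here is a minimal Python sketch. It is a simplified model and not Grafana's actual evaluator: it assumes one evaluation per interval and treats the "for" period as a number of consecutive breaching evaluations. The function name and the state names are only for illustration.

```python
def alert_states(values, threshold, for_evals):
    """Return one state per evaluation: Normal, Pending or Alerting.

    Simplified model (an assumption, not Grafana's implementation):
    a breach starts as Pending and only becomes Alerting once the
    condition has held for more than `for_evals` consecutive evaluations.
    """
    states = []
    breaching = 0  # consecutive evaluations above the threshold
    for value in values:
        if value > threshold:
            breaching += 1
            states.append("Alerting" if breaching > for_evals else "Pending")
        else:
            breaching = 0  # any healthy evaluation resets the clock
            states.append("Normal")
    return states


# CPU % sampled once per minute, threshold 75%, "for" of 2 evaluations.
cpu = [80, 80, 80, 70, 80, 80, 80, 80]
states = alert_states(cpu, threshold=75, for_evals=2)
```

The point of the pending period is that a short spike does not raise an alert, only a sustained breach does.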

To create an Alert Rule you could do the following:-

POST http://localhost:3000/api/ruler/grafana/api/v1/rules/{your folder name here}

In the JSON below, replace {your datasource uid here} with the uid of your own data source, and replace {your subscription id here} with your own subscription ID.

{
    "name": "FUNCTION APPS - HTTP Server Errors (Total)",
    "interval": "1m",
    "rules": [
        {
            "expr": "",
            "for": "5m",
            "labels": {
                "Customer": "test customer",
                "alertto": "gregor"
            },
            "annotations": {
                "summary": "FUNCTION APPS - HTTP Server Errors > 100"
            },
            "grafana_alert": {
                "id": 115,
                "orgId": 27,
                "title": "FUNCTION APPS - HTTP Server Errors (Total)",
                "condition": "B",
                "data": [
                    {
                        "refId": "A",
                        "queryType": "Azure Monitor",
                        "relativeTimeRange": {
                            "from": 600,
                            "to": 0
                        },
                        "datasourceUid": "{your datasource uid here}",
                        "model": {
                            "azureMonitor": {
                                "aggregation": "Total",
                                "alias": "{{ resourcename }} - {{ metric }}",
                                "dimensionFilters": [],
                                "metricDefinition": "Microsoft.Web/sites",
                                "metricName": "Http5xx",
                                "metricNamespace": "Microsoft.Web/sites",
                                "resourceGroup": "rg-grafanaresources",
                                "resourceName": "grafana1",
                                "timeGrain": "auto"
                            },
                            "hide": false,
                            "intervalMs": 1000,
                            "maxDataPoints": 43200,
                            "queryType": "Azure Monitor",
                            "refId": "A",
                            "subscription": "{your subscription id here}"
                        }
                    },
                    {
                        "refId": "B",
                        "queryType": "",
                        "relativeTimeRange": {
                            "from": 0,
                            "to": 0
                        },
                        "datasourceUid": "-100",
                        "model": {
                            "conditions": [
                                {
                                    "evaluator": {
                                        "params": [
                                            100
                                        ],
                                        "type": "gt"
                                    },
                                    "operator": {
                                        "type": "and"
                                    },
                                    "query": {
                                        "params": [
                                            "A"
                                        ]
                                    },
                                    "reducer": {
                                        "params": [],
                                        "type": "last"
                                    },
                                    "type": "query"
                                }
                            ],
                            "datasource": {
                                "type": "__expr__",
                                "uid": "-100"
                            },
                            "hide": false,
                            "intervalMs": 1000,
                            "maxDataPoints": 43200,
                            "refId": "B",
                            "type": "classic_conditions"
                        }
                    }
                ],
                "intervalSeconds": 60,
                "rule_group": "FUNCTION APPS - HTTP Server Errors (Total)",
                "no_data_state": "NoData",
                "exec_err_state": "Alerting"
            }
        },
        {
            "expr": "",
            "for": "5m",
            "labels": {
                "Customer": "test customer",
                "alertto": "gregor"
            },
            "annotations": {
                "summary": "Azure SQL - DATA IO % > 75%"
            },
            "grafana_alert": {
                "id": 121,
                "orgId": 27,
                "title": "Azure SQL - Log IO %",
                "condition": "B",
                "data": [
                    {
                        "refId": "A",
                        "queryType": "Azure Monitor",
                        "relativeTimeRange": {
                            "from": 600,
                            "to": 0
                        },
                        "datasourceUid": "{your datasource uid here}",
                        "model": {
                            "azureMonitor": {
                                "aggregation": "Average",
                                "alias": "{{ resourcename }} - {{ metric }}",
                                "dimensionFilters": [],
                                "metricDefinition": "Microsoft.Sql/servers/databases",
                                "metricName": "log_write_percent",
                                "metricNamespace": "Microsoft.Sql/servers/databases",
                                "resourceGroup": "rg-grafanaresources",
                                "resourceName": "grafanadb/grafanadb",
                                "timeGrain": "auto"
                            },
                            "hide": false,
                            "intervalMs": 1000,
                            "maxDataPoints": 43200,
                            "queryType": "Azure Monitor",
                            "refId": "A",
                            "subscription": "{your subscription id here}"
                        }
                    },
                    {
                        "refId": "B",
                        "queryType": "",
                        "relativeTimeRange": {
                            "from": 0,
                            "to": 0
                        },
                        "datasourceUid": "-100",
                        "model": {
                            "conditions": [
                                {
                                    "evaluator": {
                                        "params": [
                                            75
                                        ],
                                        "type": "gt"
                                    },
                                    "operator": {
                                        "type": "and"
                                    },
                                    "query": {
                                        "params": [
                                            "A"
                                        ]
                                    },
                                    "reducer": {
                                        "params": [],
                                        "type": "last"
                                    },
                                    "type": "query"
                                }
                            ],
                            "datasource": {
                                "type": "__expr__",
                                "uid": "-100"
                            },
                            "hide": false,
                            "intervalMs": 1000,
                            "maxDataPoints": 43200,
                            "refId": "B",
                            "type": "classic_conditions"
                        }
                    }
                ],
                "intervalSeconds": 60,
                "rule_group": "Azure SQL - Log IO %",
                "no_data_state": "NoData",
                "exec_err_state": "Alerting"
            }
        }        
    ]
}
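
If you would rather script this than use Postman, the call above could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library. The endpoint path is the one shown above; the helper names, the API key value and the folder name are placeholders, and it assumes the API key is sent as a Bearer token.

```python
import json
import urllib.parse
import urllib.request

GRAFANA_URL = "http://localhost:3000"  # same local instance as in the post


def build_rule_group_request(folder, rule_group, api_key, base_url=GRAFANA_URL):
    """Build (but do not send) the POST that saves a rule group into a folder."""
    # Folder names can contain spaces, so URL-encode the path segment.
    path = "/api/ruler/grafana/api/v1/rules/" + urllib.parse.quote(folder, safe="")
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(rule_group).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: API key as Bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response (needs a running Grafana)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values: paste the full rule group JSON from above into `rule_group`.
rule_group = {"name": "FUNCTION APPS - HTTP Server Errors (Total)", "interval": "1m", "rules": []}
request = build_rule_group_request("My Folder", rule_group, api_key="my-api-key")
```

Calling send(request) performs the actual POST. Building the request separately makes it easy to inspect the URL and body before anything is sent.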

Lots of companies have products that produce nice dashboards, but in my opinion a dashboard is useless on its own. For the most part you shouldn’t have to look at a dashboard, especially if you’re doing something at scale. So I want a dashboard with alerts that email me, or create a TopDesk or ServiceNow ticket, when something is awry.

Contact points in Grafana define how someone or something should be contacted. These are normally email addresses, or endpoints such as an Azure Function endpoint, which you can use to create tickets, for example.

Notification policies act on the settings you provide. For example, if a label is matched then one of the contact points is used to do something. So if an alert is raised and its label is production, you can send the alert to the contact point you created, which calls an Azure Function that creates a ServiceNow ticket.
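
To make the label matching idea concrete, here is a minimal Python sketch of how a policy could pick a contact point from an alert's labels. It is a simplified model, not Grafana's routing code: it only handles the "=" and "!=" operators, checks one level of child routes, and assumes label names and values are compared exactly (case-sensitive). The "env" label and the "servicenow-function" contact point are made-up names.

```python
def route_matches(labels, object_matchers):
    """Return True when every [name, operator, value] matcher accepts the labels."""
    for name, operator, value in object_matchers:
        actual = labels.get(name, "")  # a missing label is treated as empty
        if operator == "=":
            if actual != value:
                return False
        elif operator == "!=":
            if actual == value:
                return False
        else:
            raise ValueError(f"operator {operator!r} is not handled in this sketch")
    return True


def pick_receiver(labels, route):
    """Use the first matching child route, else fall back to the default receiver."""
    for child in route.get("routes", []):
        if route_matches(labels, child.get("object_matchers", [])):
            # A child route without its own receiver inherits the parent's.
            return child.get("receiver", route["receiver"])
    return route["receiver"]


# Hypothetical policy: production alerts go to an Azure Function contact point.
policy = {
    "receiver": "grafana-default-email",
    "routes": [
        {"receiver": "servicenow-function", "object_matchers": [["env", "=", "production"]]}
    ],
}
production_receiver = pick_receiver({"env": "production"}, policy)
other_receiver = pick_receiver({"env": "dev"}, policy)
```

Under the exact-match assumption, a label called "Env" would not match a matcher written for "env", so it is worth keeping label names consistent between rules and policies.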

The Grafana API can be found here – https://editor.swagger.io/?url=https://raw.githubusercontent.com/grafana/grafana/main/pkg/services/ngalert/api/tooling/post.json

It’s an interesting mix of v1 and v2 endpoints; some work and some don’t. I have had no luck getting the endpoints for contact points and notification policies to work, but you can use the following calls to get and save the config should you want to create more of these at scale for other dashboards.

GET http://localhost:3000/api/alertmanager/grafana/config/api/v1/alerts

{
    "template_files": {},
    "alertmanager_config": {
        "route": {
            "receiver": "grafana-default-email",
            "routes": [
                {
                    "object_matchers": [
                        [
                            "customer",
                            "=",
                            "test customer"
                        ]
                    ]
                }
            ]
        },
        "templates": null,
        "receivers": [
            {
                "name": "grafana-default-email",
                "grafana_managed_receiver_configs": [
                    {
                        "uid": "ED40XnQnz",
                        "name": "email receiver",
                        "type": "email",
                        "disableResolveMessage": false,
                        "settings": {
                            "addresses": "<example@email.com>"
                        },
                        "secureFields": {}
                    }
                ]
            },
            {
                "name": "Gregor Suttie",
                "grafana_managed_receiver_configs": [
                    {
                        "uid": "ED4AunQ7kz",
                        "name": "Gregor Suttie",
                        "type": "email",
                        "disableResolveMessage": false,
                        "settings": {
                            "addresses": "azuregreg@azure.com",
                            "singleEmail": false
                        },
                        "secureFields": {}
                    }
                ]
            }
        ]
    }
}

And you can post the same JSON (without the uid filled out) to create contact points and notification policies:

POST http://localhost:3000/api/alertmanager/grafana/config/api/v1/alerts

{
    "template_files": {},
    "alertmanager_config": {
        "route": {
            "receiver": "grafana-default-email",
            "routes": [
                {
                    "object_matchers": [
                        [
                            "customer",
                            "=",
                            "test customer"
                        ]
                    ]
                }
            ]
        },
        "templates": null,
        "receivers": [
            {
                "name": "grafana-default-email",
                "grafana_managed_receiver_configs": [
                    {
                        "uid": "",
                        "name": "email receiver",
                        "type": "email",
                        "disableResolveMessage": false,
                        "settings": {
                            "addresses": "<example@email.com>"
                        },
                        "secureFields": {}
                    }
                ]
            },
            {
                "name": "Gregor Suttie",
                "grafana_managed_receiver_configs": [
                    {
                        "uid": "",
                        "name": "Gregor Suttie",
                        "type": "email",
                        "disableResolveMessage": false,
                        "settings": {
                            "addresses": "azuregreg@azure.com",
                            "singleEmail": false
                        },
                        "secureFields": {}
                    }
                ]
            }
        ]
    }
}
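
Blanking the uid values by hand gets tedious when there are many contact points. A small Python helper along these lines could do it on the JSON returned by the GET call before you POST it back. This is a minimal sketch based on the structure shown above; the function name is made up.

```python
import copy


def blank_receiver_uids(config):
    """Return a copy of the alertmanager config with every receiver uid emptied."""
    cleaned = copy.deepcopy(config)  # leave the original response untouched
    receivers = cleaned.get("alertmanager_config", {}).get("receivers") or []
    for receiver in receivers:
        for item in receiver.get("grafana_managed_receiver_configs") or []:
            item["uid"] = ""  # posted without a uid, as described above
    return cleaned


# Trimmed-down version of the GET response shown earlier.
exported = {
    "template_files": {},
    "alertmanager_config": {
        "route": {"receiver": "grafana-default-email"},
        "receivers": [
            {
                "name": "grafana-default-email",
                "grafana_managed_receiver_configs": [
                    {"uid": "ED40XnQnz", "name": "email receiver", "type": "email"}
                ],
            }
        ],
    },
}
to_post = blank_receiver_uids(exported)
```

The helper works on a copy, so the exported config can still be kept as a backup of what was on the server.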

A note on the API: as I mentioned before, the Grafana API is a bit hit and miss. I use it from Postman, and here is how I set up Postman to get it working.

I create an API key from within Grafana (under Configuration and then API Keys) and set that as a Bearer Token under the Authorization section in Postman like so:-

And the Headers are pretty standard like so:-
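
The same setup outside Postman could look roughly like the Python sketch below. The exact header list here is an assumption (a typical JSON API set plus the Bearer token), and the helper names and the API key value are placeholders. The endpoint is the alertmanager config call used earlier.

```python
import urllib.request


def grafana_headers(api_key):
    """Headers assumed for these calls: Bearer token plus JSON in and out."""
    return {
        "Authorization": f"Bearer {api_key}",  # the API key created in Grafana
        "Accept": "application/json",
        "Content-Type": "application/json",
    }


def build_get_config_request(api_key, base_url="http://localhost:3000"):
    """Build (but do not send) the GET for the alertmanager config."""
    return urllib.request.Request(
        base_url + "/api/alertmanager/grafana/config/api/v1/alerts",
        headers=grafana_headers(api_key),
        method="GET",
    )


headers = grafana_headers("my-api-key")  # placeholder key
request = build_get_config_request("my-api-key")
```

Passing the request to urllib.request.urlopen sends it, which needs a running Grafana instance and a valid key.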

If you have questions or get stuck, reach out to me here in the comments below or on Twitter.

Gregor Suttie (https://gregorsuttie.com)
I have over 20 years of experience in various roles throughout my career, mainly working with Microsoft technologies. Roles include developer, site reliability engineer, architect, team lead and team manager. I have experience building large Azure solutions using numerous Azure services to deliver the right solutions to our customers.
