1-Pick distributions

- Claim sizes and frequencies are generally modelled using Gamma and Poisson distributions respectively.
- For other response variables, the response would need further examination (for example its range, skewness and mean-variance relationship) before a distribution is chosen.
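
One simple way to examine a count response is to compare its sample mean and variance, since a Poisson variable has variance equal to its mean. The sketch below assumes a small made-up list of claim counts (`claim_counts` is a hypothetical name, not real data) and is only a rough first check, not a formal test.

```python
from statistics import mean, variance

# Made-up claim counts per policy, for illustration only.
claim_counts = [0, 1, 0, 2, 0, 0, 1, 3, 0, 1]

m = mean(claim_counts)
v = variance(claim_counts)  # sample variance (n - 1 denominator)

# A ratio near 1 is consistent with a Poisson assumption;
# a ratio well above 1 suggests over-dispersion.
dispersion_ratio = v / m
print(f"mean={m:.3f} variance={v:.3f} ratio={dispersion_ratio:.3f}")
```

With so few observations the ratio is very noisy, so it would normally be read alongside the other checks in these notes.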

2-Pick link function

- This depends on the nature of the response variable. For example, a response whose mean must be positive would typically use the log link function, whereas a response between 0 and 1 (such as a probability) would use a logit link function.
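
As a minimal sketch of why these links suit those ranges: the inverse log link maps any linear predictor to a positive mean, and the inverse logit maps it into the interval (0, 1). The function names below are illustrative, not taken from any particular library.

```python
import math

def log_link(mu):
    """Log link: maps a positive mean to the whole real line."""
    return math.log(mu)

def inv_log_link(eta):
    """Inverse log link: any linear predictor gives a positive mean."""
    return math.exp(eta)

def logit_link(mu):
    """Logit link: maps a mean in (0, 1) to the whole real line."""
    return math.log(mu / (1.0 - mu))

def inv_logit_link(eta):
    """Inverse logit link: any linear predictor gives a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Whatever the sign of the linear predictor, the fitted mean stays in range.
for eta in (-2.0, 0.0, 2.0):
    print(eta, inv_log_link(eta), inv_logit_link(eta))
```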

3-Analyse data

- What explanatory variables have been provided?
- What does the response variable look like by each explanatory variable? One-way summaries or pivot tables could be analysed for insight.
- Consider grouping for categorical variables.
- Consider transformations for variables with non-linear shapes.
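
A one-way summary can be sketched in a few lines. The example below assumes hypothetical policy records of the form (age band, exposure in years, claim count) and reports claim frequency per level; the field names and figures are made up for illustration.

```python
from collections import defaultdict

# Hypothetical records: (age_band, exposure_years, claim_count).
records = [
    ("17-25", 1.0, 1),
    ("17-25", 0.5, 0),
    ("26-40", 1.0, 0),
    ("26-40", 1.0, 1),
    ("26-40", 0.5, 0),
    ("41+", 1.0, 0),
    ("41+", 1.0, 0),
]

def one_way_frequency(rows):
    """Claim frequency (claims per unit of exposure) for each level."""
    totals = defaultdict(lambda: [0.0, 0])  # level -> [exposure, claims]
    for level, exposure, claims in rows:
        totals[level][0] += exposure
        totals[level][1] += claims
    return {level: claims / exposure
            for level, (exposure, claims) in totals.items()}

summary = one_way_frequency(records)
for level, freq in sorted(summary.items()):
    print(level, round(freq, 3))
```

Levels with little exposure or similar frequencies are natural candidates for grouping.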

4-Fit the model using maximum likelihood estimation (most likely done with statistical software)
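
To illustrate what the software does, here is a minimal sketch of maximum likelihood fitting for a Poisson GLM with a log link and a single explanatory variable, using Newton-Raphson (which coincides with iteratively reweighted least squares for a canonical link). The function name `fit_poisson_glm` and the data are hypothetical; a real model would be fitted with an established package.

```python
import math

def fit_poisson_glm(x, y, n_iter=25, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: X^T (y - mu)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: X^T diag(mu) X
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system explicitly
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up data: a 0/1 indicator and claim counts, for illustration only.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 1, 0, 1, 2, 1, 3, 2]
b0, b1 = fit_poisson_glm(x, y)
print(f"fitted mean for x=0: {math.exp(b0):.3f}, for x=1: {math.exp(b0 + b1):.3f}")
```

With a single 0/1 variable the fitted means should match the observed average count in each group, which gives a simple sanity check on the optimiser.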

5-Assess output and p-values

- Variables that are not statistically significant (for example, those with high p-values) may be dropped.
- Data mining or decision trees can help to find areas that are not fitting well and refine them.
- Address any large individual observations or outliers distorting results.
- Fit curves (for example polynomials or splines) to continuous variables, rather than many separate levels, to reduce over-fitting.
- This may be an iterative process requiring judgment.
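
As a hedged illustration of where a p-value comes from, the sketch below computes a Wald test for the slope of a Poisson log-link model with one explanatory variable. The standard error is taken from the inverse of the information matrix and the p-value from a two-sided normal approximation. The function name `wald_test_slope` and the coefficient values are placeholders standing in for output from a fitting routine.

```python
import math

def wald_test_slope(x, b0, b1):
    """Return (standard error, z statistic, two-sided p-value) for b1
    in the Poisson model log(mu) = b0 + b1 * x."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    # Information matrix X^T diag(mu) X for the 2-parameter model
    i00 = sum(mu)
    i01 = sum(xi * mi for xi, mi in zip(x, mu))
    i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    det = i00 * i11 - i01 * i01
    var_b1 = i00 / det  # slope entry of the inverse information matrix
    se = math.sqrt(var_b1)
    z = b1 / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return se, z, p

# Placeholder inputs: a 0/1 indicator and assumed fitted coefficients.
x = [0, 0, 0, 0, 1, 1, 1, 1]
se, z, p = wald_test_slope(x, b0=math.log(0.5), b1=math.log(4.0))
print(f"se={se:.3f} z={z:.3f} p={p:.3f}")
```

A p-value above the chosen threshold would flag the variable as a candidate for removal, subject to the judgment mentioned above.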

6-Test how well the GLM predicts using a held-out subset of the data
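
A minimal sketch of such a test, assuming made-up (group, claim count) rows: the model is fitted on the training rows only, then its Poisson deviance on the held-out rows is compared with that of an intercept-only model. The every-fourth-row split is used purely to keep the example reproducible; a random or time-based split is more usual in practice.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

def group_means(data):
    """Mean count per group; for one categorical factor with equal exposure
    this is the Poisson GLM maximum likelihood fit."""
    sums, counts = {}, {}
    for g, y in data:
        sums[g] = sums.get(g, 0) + y
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

# Made-up (group, claim_count) rows, for illustration only.
rows = [(0, 0), (0, 1), (0, 0), (0, 1), (1, 2), (1, 1), (1, 3), (1, 2)]

# Deterministic split: every fourth row is held out, the rest are used to fit.
holdout = rows[::4]
train = [row for i, row in enumerate(rows) if i % 4 != 0]

fitted = group_means(train)                      # model with the factor
overall = sum(y for _, y in train) / len(train)  # intercept-only model

y_holdout = [y for _, y in holdout]
dev_model = poisson_deviance(y_holdout, [fitted[g] for g, _ in holdout])
dev_null = poisson_deviance(y_holdout, [overall] * len(holdout))
print(f"holdout deviance: model={dev_model:.3f} null={dev_null:.3f}")
```

A lower holdout deviance for the fitted model than for the intercept-only model suggests the explanatory variable adds predictive value beyond the training data.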