- Claim sizes and frequencies are generally modelled using Gamma and Poisson distributions respectively.
- Otherwise, the response variable needs further examination before a distribution is chosen.
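One simple way to examine a count response is to compare its sample variance with its sample mean, since a Poisson distribution has the two equal. The sketch below is only illustrative: the claim counts are made up, and `dispersion_ratio` is a hypothetical helper name, not part of any library.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    A value near 1 is consistent with a Poisson assumption; a value well
    above 1 suggests over-dispersion and a need to look further.
    """
    return variance(counts) / mean(counts)

# Hypothetical claim counts per policy, made up for illustration.
claim_counts = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]
ratio = dispersion_ratio(claim_counts)
print(f"variance / mean = {ratio:.2f}")
```

With so few observations the ratio is very noisy, so in practice it would be read alongside other diagnostics rather than on its own.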
2-Pick link function
- This depends on the nature of the response variable. For example, a response with a strictly positive mean (such as claim size or frequency) would typically use the log link function, whereas a response between 0 and 1 (such as a probability) would typically use the logit link function.
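As a minimal sketch of the two link functions mentioned above, the code below defines each link and its inverse. The function names are chosen here for illustration. The link maps the mean onto the unrestricted scale of the linear predictor, and the inverse maps a linear predictor back to a valid mean.

```python
import math

def log_link(mu):
    """Log link: maps a positive mean to the whole real line."""
    return math.log(mu)

def inv_log_link(eta):
    """Inverse log link: any linear predictor gives a positive mean."""
    return math.exp(eta)

def logit_link(mu):
    """Logit link: maps a mean in (0, 1) to the whole real line."""
    return math.log(mu / (1.0 - mu))

def inv_logit_link(eta):
    """Inverse logit link: any linear predictor gives a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Whatever value the linear predictor takes, the fitted mean stays in range.
for eta in (-3.0, 0.0, 3.0):
    print(eta, inv_log_link(eta), inv_logit_link(eta))
```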
3-Choose explanatory variables
- What explanatory variables have been provided?
- What does the response variable look like by each explanatory variable? One-way summaries or pivot tables can be analysed for insight.
- Consider grouping for categorical variables.
- Consider transformations for variables with non-linear shapes.
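The one-way summary and the grouping of categorical levels can be sketched as below. The policy records, the `age_band` and `claims` field names, and both helper functions are hypothetical and exist only for this example.

```python
from collections import defaultdict

def one_way_summary(records, factor, response):
    """Average response for each level of one explanatory variable."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in records:
        sums[row[factor]] += row[response]
        counts[row[factor]] += 1
    return {level: sums[level] / counts[level] for level in sums}

def group_levels(records, factor, mapping):
    """Return copies of the records with selected levels merged via `mapping`."""
    return [{**row, factor: mapping.get(row[factor], row[factor])} for row in records]

# Made-up policy data for illustration only.
policies = [
    {"age_band": "18-25", "claims": 2},
    {"age_band": "18-25", "claims": 1},
    {"age_band": "26-40", "claims": 0},
    {"age_band": "26-40", "claims": 1},
    {"age_band": "26-40", "claims": 0},
    {"age_band": "41-60", "claims": 0},
    {"age_band": "41-60", "claims": 0},
    {"age_band": "61+", "claims": 1},
]

summary = one_way_summary(policies, "age_band", "claims")

# Merge two thinly populated bands into one level before modelling.
grouped = group_levels(policies, "age_band", {"41-60": "41+", "61+": "41+"})
grouped_summary = one_way_summary(grouped, "age_band", "claims")

for level, avg in sorted(grouped_summary.items()):
    print(level, round(avg, 3))
```

Which levels to merge is a judgment call; here the two oldest bands are combined simply because each has very few records.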
4-Optimising/fitting the model using maximum likelihood estimation (most likely done with statistical software).
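To show what the software is doing, here is a minimal sketch of maximum likelihood fitting for a Poisson GLM with a log link and a single explanatory variable, using Newton-Raphson steps. The data, the variable names and the `fit_poisson_glm` function are assumptions made for this example. Real software handles many variables, offsets and convergence checks that are omitted here.

```python
import math

def fit_poisson_glm(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of log(mu) = b0 + b1 * x for Poisson counts y."""
    # Start from the intercept-only fit, whose fitted mean is the sample mean.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), 2x2 and symmetric.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * d = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up data: a hypothetical rating factor and observed claim counts.
rating_factor = [0, 1, 2, 3, 4]
claim_counts = [1, 1, 2, 4, 6]

b0, b1 = fit_poisson_glm(rating_factor, claim_counts)
fitted = [math.exp(b0 + b1 * xi) for xi in rating_factor]
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
```

At the maximum the score vector is zero, so the fitted means sum to the observed total, which is a quick sanity check on any fit of this kind.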
5-Assessing output and p-values
- Some variables may be dropped based on their statistical insignificance.
- Data mining or decision trees can help to find areas that are not fitting well and refine them.
- Address any large individual observations or outliers distorting results.
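- Fit curves (for example, a smooth function of a continuous variable instead of a separate parameter for each level) to reduce over-fitting.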
- This may be an iterative process requiring judgment.
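The p-value screening above can be sketched with a Wald test, which compares each estimate with its standard error under a normal approximation. This is one common choice and not necessarily what a given package reports. The coefficient names, estimates, standard errors and the 5% threshold below are all made up for illustration.

```python
import math

def wald_p_value(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical fitted output: name -> (estimate, standard error).
fitted_output = {
    "intercept": (-1.20, 0.15),
    "vehicle_age": (0.08, 0.02),
    "colour_red": (0.03, 0.05),
}

p_values = {name: wald_p_value(b, se) for name, (b, se) in fitted_output.items()}

# Flag variables for review; dropping them is still a matter of judgment.
candidates_to_drop = [name for name, p in p_values.items() if p > 0.05]
print(candidates_to_drop)
```

A flagged variable is only a candidate for removal, since a variable may be kept for business or pricing reasons even when its p-value is high.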
6-Testing how well the GLM predicts using a held-out subset of the data
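A minimal sketch of this check follows, assuming a random split and the Poisson deviance as the measure of fit. The model is a placeholder: an intercept-only Poisson GLM, whose maximum likelihood prediction is simply the training mean. The data, the 30% holdout fraction and the function names are assumptions for the example.

```python
import math
import random

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and return (training set, holdout set)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * test_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def poisson_deviance(actual, predicted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    total = 0.0
    for y, mu in zip(actual, predicted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total

# Made-up claim counts for illustration.
claim_counts = [0, 1, 0, 2, 1, 0, 3, 1, 0, 2]
train, holdout = holdout_split(claim_counts)

# Placeholder model: intercept-only Poisson GLM, fitted on the training set only.
train_mean = sum(train) / len(train)
holdout_deviance = poisson_deviance(holdout, [train_mean] * len(holdout))
print(f"holdout deviance = {holdout_deviance:.3f}")
```

The same deviance can be computed for competing models on the same holdout set, with the lower value indicating the better out-of-sample fit.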