OK, here goes. It may be tough to put it all in layman's terms, but I've tried to simplify the concepts as much as I can without getting too mathematical. (Note: the charts below have been sourced from other websites.)
Firstly, differencing is, as the name suggests, taking the difference between two terms of the series. With normal differencing you are trying to remove a trend over time. With seasonal differencing you are trying to remove trends across seasons, i.e. a pattern that repeats from one season to the next. The aim of both is to make sure the series is stationary (roughly, its mean and variance stay constant over time) before modelling. See the example below: the bottom series is non-stationary, while the one above it is considered stationary.
Normal differencing, as referred to in ARIMA (and in time series modelling more generally), is used to de-trend the data. One level of differencing means you take the current value and subtract the prior value from it, i.e. x(t) - x(t-1). This is done for all time points, and the result is a new time series that is one point shorter. If this series still shows a trend, you can apply another level of differencing to the first-level differenced series. This gives a new series with two levels of differencing applied.
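To make the two levels of differencing concrete, here is a minimal sketch in plain Python. The helper name difference and the example series are my own, made up for illustration; the series follows a curved (quadratic) trend, so one level of differencing is not enough but two levels are.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical series with a curved upward trend: x(t) = t^2
x = [t * t for t in range(8)]   # 0, 1, 4, 9, 16, 25, 36, 49

d1 = difference(x)    # first level: 1, 3, 5, ... still trending upwards
d2 = difference(d1)   # second level: constant, so the trend is gone

print(d1)
print(d2)
```

Each level of differencing drops one point from the start of the series, because the first value has no prior value to subtract.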
Seasonal differencing takes the difference between the current value and the value at the same point in the previous season. For example, if you are modelling at the monthly level with a yearly season, then the differencing applied would be x(t) - x(t-12). This will remove any first-level seasonal trends. Higher levels would just be taking the difference of the differences. Below is an example where you can see that the seasonal component is getting larger and larger. This is a case where it would be reasonable to do seasonal differencing. Obviously, in practice you are rarely working with such clean data. This series would also require normal differencing to remove the upward trend.
If you were to combine both seasonal and normal differencing for monthly data, it would look something like this. You take the normal difference first, then you take the seasonal difference of the result.
(x(t) - x(t-1)) - (x(t-12) - x(t-13))
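The combined formula can be checked with a small sketch. The monthly series below is invented for illustration (a straight-line trend plus a fixed 12-month pattern), and difference is a hypothetical helper, not a library function.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly data: linear trend plus a repeating 12-month pattern
season = [5, 3, 0, -2, -4, -5, -3, 0, 2, 4, 6, 8]
x = [2 * t + season[t % 12] for t in range(36)]

# Seasonal differencing alone: x(t) - x(t-12)
seasonal = difference(x, lag=12)

# Normal difference first, then seasonal difference of the result
combined = difference(difference(x, lag=1), lag=12)

# The same thing written out directly from the formula
direct = [(x[t] - x[t - 1]) - (x[t - 12] - x[t - 13]) for t in range(13, len(x))]

print(seasonal[:3])
print(combined == direct)
```

For this clean example the seasonal difference removes the repeating pattern and leaves a constant, and the combined difference leaves all zeros. Taking the seasonal difference first and the normal difference second gives the same series, so the order is a matter of convenience.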
Moving on to the autoregressive (AR) and moving average (MA) components. If you want to think of it quite simply in terms of a regression model, then the autoregressive components are prior values of the series, used as predictors of the current value. So if you consider x(t) as the current value, then the first AR component is x(t-1) multiplied by a fitted coefficient. The second AR component would be x(t-2) multiplied by its own coefficient, and so on. These are often referred to as lagged terms. So the prior value is called the first lag, the one prior to that the second lag, and so on.
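As a rough sketch of how the lagged terms combine, the snippet below computes a one-step prediction from the last two values. The function name ar_predict, the history, the constant and the coefficients are all made up for illustration; in a real model the coefficients would be fitted to the data.

```python
def ar_predict(history, coefficients, constant=0.0):
    """One-step AR prediction: constant + coefficient_k * x(t-k), summed over the lags."""
    p = len(coefficients)
    # Most recent value first: [x(t-1), x(t-2), ..., x(t-p)]
    lags = history[-1:-p - 1:-1]
    return constant + sum(phi * lag for phi, lag in zip(coefficients, lags))

history = [10.0, 12.0, 11.0, 13.0]   # hypothetical past values, oldest first
phi = [0.5, 0.25]                    # hypothetical coefficients for lag 1 and lag 2

# 1.0 + 0.5 * 13.0 + 0.25 * 11.0
prediction = ar_predict(history, phi, constant=1.0)
print(prediction)
```

With two coefficients this is an AR model of order 2: only the first and second lags contribute to the prediction.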
Moving average is where it gets a bit complicated, and the name is a bit misleading. These components don't actually refer to the moving average of the series itself. Rather, they use past errors to predict the current value. To put it simply, if your series is stationary then it will have a constant mean, but at any point in time it may not actually be at the mean. As a first picture, the distance from the mean is called the error. (Strictly speaking, the error in a fitted model is the part of the value that the model could not predict, but the distance from the mean is a good way to think about it to begin with.)
To put it mathematically:
x(t) = Mean + Error(t)
x(t-1) = Mean + Error(t-1)
So the first-order moving average component would be Error(t-1) multiplied by a fitted coefficient, the second-order component Error(t-2) multiplied by its own coefficient, and so on.
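Here is a small sketch of a first-order moving average, assuming the usual MA(1) form x(t) = Mean + Error(t) + theta * Error(t-1). That form goes one step beyond the simplified equations above, and the mean, theta and the errors are invented numbers rather than fitted values.

```python
mean = 10.0
theta = 0.5                        # hypothetical MA(1) coefficient
errors = [1.0, -2.0, 0.5, 1.5]     # hypothetical errors, one per time point

# Build the series: each value is the mean, plus its own error,
# plus theta times the previous error (no previous error at t = 0)
x = []
for t in range(len(errors)):
    previous_error = errors[t - 1] if t > 0 else 0.0
    x.append(mean + errors[t] + theta * previous_error)

# One-step forecast: the next error is unknown (expected to be zero),
# so only the most recent error carries over
forecast = mean + theta * errors[-1]

print(x)
print(forecast)
```

The point to notice is that the forecast depends on the last error, not on the last value of the series. That is what separates an MA component from an AR component.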
Hope that helps and improves your understanding of ARIMA time series models and their underlying components.