Time Series Regression

import pandas as pd

# Load the data and plot Consumption and Income on the same axes
df = pd.read_excel('uschange.xlsx')
df.Consumption.plot(legend=True)
df.Income.plot(legend=True)

:::note[output]

<Axes: >
<Figure size 640x480 with 1 Axes>

:::

from statsmodels.formula.api import ols

# Simple regression of Consumption on Income
ols('Consumption ~ Income', df).fit().summary()

:::note[output]

<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results
==============================================================================
Dep. Variable:            Consumption   R-squared:                       0.159
Model:                            OLS   Adj. R-squared:                  0.154
Method:                 Least Squares   F-statistic:                     34.98
Date:                Thu, 08 Aug 2024   Prob (F-statistic):           1.58e-08
Time:                        14:48:02   Log-Likelihood:                -169.62
No. Observations:                 187   AIC:                             343.2
Df Residuals:                     185   BIC:                             349.7
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.5451      0.056      9.789      0.000       0.435       0.655
Income         0.2806      0.047      5.915      0.000       0.187       0.374
==============================================================================
Omnibus:                       16.528   Durbin-Watson:                   1.696
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               29.145
Skew:                          -0.454   Prob(JB):                     4.69e-07
Kurtosis:                       4.707   Cond. No.                         2.08
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
"""

:::

# Multiple regression: add Production, Unemployment and Savings as predictors
ols('Consumption ~ Income + Production + Unemployment + Savings', df).fit().summary()

:::note[output]

<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results
==============================================================================
Dep. Variable:            Consumption   R-squared:                       0.754
Model:                            OLS   Adj. R-squared:                  0.749
Method:                 Least Squares   F-statistic:                     139.5
Date:                Thu, 08 Aug 2024   Prob (F-statistic):           2.62e-54
Time:                        14:48:45   Log-Likelihood:                -54.692
No. Observations:                 187   AIC:                             119.4
Df Residuals:                     182   BIC:                             135.5
Df Model:                           4
Covariance Type:            nonrobust
================================================================================
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept        0.2673      0.037      7.184      0.000       0.194       0.341
Income           0.7145      0.042     16.934      0.000       0.631       0.798
Production       0.0459      0.026      1.773      0.078      -0.005       0.097
Unemployment    -0.2048      0.106     -1.941      0.054      -0.413       0.003
Savings         -0.0453      0.003    -16.287      0.000      -0.051      -0.040
==============================================================================
Omnibus:                       18.566   Durbin-Watson:                   2.169
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               28.796
Skew:                           0.564   Prob(JB):                     5.58e-07
Kurtosis:                       4.556   Cond. No.                         61.9
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
"""

:::
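The two summaries above differ mainly in fit quality: adjusted R-squared rises from 0.154 to 0.749 and AIC falls from 343.2 to 119.4 once the extra predictors are added. These quantities do not have to be read off the printed tables, since the fitted results object exposes them as attributes. The following is a minimal sketch on synthetic data; the columns `y`, `x1`, `x2` and the generated values are assumptions for illustration, not the uschange data.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Synthetic stand-in data (hypothetical): y depends on both x1 and x2.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({'x1': rng.normal(size=n), 'x2': rng.normal(size=n)})
demo['y'] = 0.5 + 0.7 * demo.x1 - 2.0 * demo.x2 + rng.normal(scale=0.3, size=n)

small = ols('y ~ x1', demo).fit()
large = ols('y ~ x1 + x2', demo).fit()

# Collect the comparison statistics from the results objects.
comparison = pd.DataFrame(
    {
        'adj_r2': [small.rsquared_adj, large.rsquared_adj],
        'aic': [small.aic, large.aic],
    },
    index=['y ~ x1', 'y ~ x1 + x2'],
)
print(comparison)
```

A higher adjusted R-squared together with a lower AIC points to the larger model, which is the same pattern as in the two summaries above.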

 

Trend

n = len(df)
n

:::note[output]

187

:::

import numpy as np

# Linear time trend: t = 0, 1, ..., n-1
df['t'] = np.arange(n)
ols('Consumption ~ t', df).fit().summary()

:::note[output]

<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results
==============================================================================
Dep. Variable:            Consumption   R-squared:                       0.025
Model:                            OLS   Adj. R-squared:                  0.020
Method:                 Least Squares   F-statistic:                     4.726
Date:                Thu, 08 Aug 2024   Prob (F-statistic):             0.0310
Time:                        15:03:24   Log-Likelihood:                -183.46
No. Observations:                 187   AIC:                             370.9
Df Residuals:                     185   BIC:                             377.4
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.9242      0.095      9.777      0.000       0.738       1.111
t             -0.0019      0.001     -2.174      0.031      -0.004      -0.000
==============================================================================
Omnibus:                       46.181   Durbin-Watson:                   1.333
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              115.917
Skew:                          -1.058   Prob(JB):                     6.74e-26
Kurtosis:                       6.225   Cond. No.                         214.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
"""

:::
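The coefficient on `t` is the estimated average change in `Consumption` per period. The summary rounds to three decimals, which is why the slope's interval prints as [-0.004, -0.000]; the unrounded values are available from the results object. The following is a sketch on a synthetic series, where the slope of -0.002 and the noise level are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Hypothetical series with a small downward linear trend plus noise.
rng = np.random.default_rng(0)
n = 187
demo = pd.DataFrame({'t': np.arange(n)})
demo['y'] = 1.0 - 0.002 * demo.t + rng.normal(scale=0.1, size=n)

fit = ols('y ~ t', demo).fit()

slope = fit.params['t']              # estimated change in y per period
p_value = fit.pvalues['t']           # t-test of "slope == 0"
low, high = fit.conf_int().loc['t']  # 95% interval at full precision

print(f'slope={slope:.6f}, p={p_value:.3g}, 95% CI=({low:.6f}, {high:.6f})')
```

Multiplying the slope by `n - 1` gives the total change implied by the fitted line over the whole sample.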

 

Seasonality

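year = n // 4  # integer quotient: number of complete years
# Quarter labels 1-4, repeated and trimmed to n rows (assumes the first row is quarter 1)
df['season'] = np.tile([1, 2, 3, 4], year + 1)[:n]
ols('Consumption ~ C(season)', df).fit().summary()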

:::note[output]

<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results
==============================================================================
Dep. Variable:            Consumption   R-squared:                       0.009
Model:                            OLS   Adj. R-squared:                 -0.007
Method:                 Least Squares   F-statistic:                    0.5513
Date:                Thu, 08 Aug 2024   Prob (F-statistic):              0.648
Time:                        15:11:51   Log-Likelihood:                -184.97
No. Observations:                 187   AIC:                             377.9
Df Residuals:                     183   BIC:                             390.9
Df Model:                           3
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept          0.7356      0.096      7.667      0.000       0.546       0.925
C(season)[T.2]    -0.0296      0.136     -0.218      0.828      -0.297       0.238
C(season)[T.3]     0.1143      0.136      0.842      0.401      -0.153       0.382
C(season)[T.4]    -0.0424      0.136     -0.311      0.756      -0.312       0.227
==============================================================================
Omnibus:                       29.475   Durbin-Watson:                   1.283
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               63.812
Skew:                          -0.718   Prob(JB):                     1.39e-14
Kurtosis:                       5.476   Cond. No.                         4.78
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
"""

:::

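Every seasonal dummy has a p-value above 0.05, and the overall F-test agrees (Prob (F-statistic) = 0.648), so we cannot reject the null hypothesis that there is no seasonality.

Seasonality may still exist, but this regression cannot rule out that it is absent.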
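Each `C(season)[T.k]` p-value only compares quarter k with the baseline quarter 1, whereas "is there any seasonality?" is a joint hypothesis that all three dummy coefficients are zero. In a model that contains only an intercept and the seasonal dummies, that joint hypothesis is what the overall F-test in the summary checks. The following is a sketch on synthetic data; the generated series and the size of the quarter-3 effect are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)
n = 187
season = np.tile([1, 2, 3, 4], n // 4 + 1)[:n]

# Hypothetical series with a built-in quarter-3 bump of +1.0.
demo = pd.DataFrame({'season': season})
demo['y'] = 0.7 + 1.0 * (demo.season == 3) + rng.normal(scale=0.5, size=n)

fit = ols('y ~ C(season)', demo).fit()

# Per-dummy t-tests: each compares one quarter with quarter 1.
print(fit.pvalues)

# Joint F-test that all seasonal dummies are zero (no seasonality).
print(fit.fvalue, fit.f_pvalue)
```

With a seasonal effect this large the joint p-value is far below 0.05, in contrast to the 0.648 reported for `Consumption`.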
