python - Regression in pandas


I have two separate databases: a temperature database with hourly data and a house database with minute-by-minute HVAC usage data. I'm trying to plot the HVAC data against the temperature as a time series over a week, a month, and a year, but since the increments don't match those of the temperature database, I'm having trouble. I've tried making a least squares fit, but a) I can't figure out how to do one in pandas, and b) it gets inaccurate after a day or two. Any suggestions?

pandas time series are a perfect fit for this application. You can merge series of different sampling frequencies and pandas will align them perfectly. Then you can downsample the data and perform a regression, e.g., with statsmodels. A mock-up example:

In [288]:
import numpy as np
import pandas as pd

idx1 = pd.date_range('2001/01/01', periods=10, freq='D')
idx2 = pd.date_range('2001/01/01', periods=500, freq='h')
df1 = pd.DataFrame(np.random.random(10), columns=['val1'])
df2 = pd.DataFrame(np.random.random(500), columns=['val2'])
df1.index = idx1
df2.index = idx2

In [291]:
# The inner merge keeps only the timestamps present in both frames
df3 = pd.merge(df1, df2, left_index=True, right_index=True, how='inner')
# Downsample to daily means
df4 = df3.resample(rule='D').mean()

In [292]:
print(df4)
                val1      val2
2001-01-01  0.399901  0.244800
2001-01-02  0.014448  0.423780
2001-01-03  0.811747  0.070047
2001-01-04  0.595556  0.679096
2001-01-05  0.218412  0.116764
2001-01-06  0.961310  0.040317
2001-01-07  0.058964  0.606843
2001-01-08  0.075129  0.407842
2001-01-09  0.833003  0.751287
2001-01-10  0.070072  0.559986

[10 rows x 2 columns]

In [294]:
import statsmodels.formula.api as smf
mod = smf.ols(formula='val1 ~ val2', data=df4)
res = mod.fit()
print(res.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                   val1   R-squared:                       0.061
Model:                            OLS   Adj. R-squared:                 -0.056
Method:                 Least Squares   F-statistic:                    0.5231
Date:                Fri, 27 Jun 2014   Prob (F-statistic):              0.490
Time:                        10:46:34   Log-Likelihood:                -3.3643
No. Observations:                  10   AIC:                             10.73
Df Residuals:                       8   BIC:                             11.33
Df Model:                           1
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept      0.5405      0.224      2.417      0.042         0.025     1.056
val2          -0.3502      0.484     -0.723      0.490        -1.467     0.766
==============================================================================
Omnibus:                        3.509   Durbin-Watson:                   2.927
Prob(Omnibus):                  0.173   Jarque-Bera (JB):                1.232
Skew:                           0.399   Prob(JB):                        0.540
Kurtosis:                       1.477   Cond. No.                         4.69
==============================================================================
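For the situation described in the question (minute-level HVAC data against hourly temperature data), the same idea might look roughly like the sketch below. It is only an illustration under stated assumptions: the data are synthetic, the column names `temp` and `hvac` are made up, and `np.polyfit` stands in for statsmodels to keep the example dependency-light. The key step is resampling the minute data to hourly means before joining, so that both series share one index.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two databases (column names are hypothetical).
rng = np.random.default_rng(0)
minute_idx = pd.date_range("2001-01-01", periods=7 * 24 * 60, freq="min")
hour_idx = pd.date_range("2001-01-01", periods=7 * 24, freq="h")

temperature = pd.DataFrame({"temp": 10 + 5 * rng.random(len(hour_idx))}, index=hour_idx)

# For illustration only: HVAC usage is made to depend linearly on temperature, plus noise.
temp_per_minute = temperature["temp"].reindex(minute_idx, method="ffill")
hvac = pd.DataFrame({"hvac": 2.0 * temp_per_minute + rng.normal(0, 0.5, len(minute_idx))})

# Downsample the minute data to hourly means so both frames use the same timestamps.
hvac_hourly = hvac.resample("h").mean()

# Align on the index; only hours present in both frames are kept.
merged = temperature.join(hvac_hourly, how="inner").dropna()

# Ordinary least squares fit of hvac ~ temp (a degree-1 polynomial).
slope, intercept = np.polyfit(merged["temp"], merged["hvac"], 1)
print(slope, intercept)
```

To look at a week, a month, or a year, the merged frame can be resampled again (for example to daily means) before plotting or fitting.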
