Release 0.5.0
statsmodels 0.5 is a large and very exciting release that brings together a year of work done by 38 authors, including over 2000 commits. It contains many new features and a large number of bug fixes, detailed below.
See the list of fixed issues for specific closed issues.
The following major new features appear in this version.
Support for Model Formulas via Patsy
statsmodels now supports fitting models with a formula. This functionality is provided by patsy, which is now a dependency of statsmodels. Models can be imported individually from the statsmodels.formula.api namespace, or you can import them all as:
import statsmodels.formula.api as smf
Alternatively, each model in the usual statsmodels.api namespace has a from_formula classmethod that will create a model using a formula. Formulas are also available for specifying linear hypothesis tests using the t_test and f_test methods after model fitting. A typical workflow can now look something like this.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
url = 'https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/csv/HistData/Guerry.csv'
data = pd.read_csv(url)
# Fit regression model (using the natural log of one of the regressors)
results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=data).fit()
See the statsmodels documentation for more on using formulas.
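The formula-based hypothesis tests mentioned above can be sketched as follows. This is a minimal illustration on synthetic data, not an excerpt from the statsmodels documentation; the column names x1, x2, and y and the simulated coefficients are assumptions made for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): y depends on x1 but not on x2.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
data["y"] = 1.0 + 2.0 * data["x1"] + rng.normal(size=n)

results = smf.ols("y ~ x1 + x2", data=data).fit()

# Linear hypotheses written as strings instead of restriction matrices.
t_res = results.t_test("x1 = 0")          # single restriction
f_res = results.f_test("x1 = 0, x2 = 0")  # joint restriction

print(t_res)
print(f_res)
```

Writing the restriction as a string keeps the hypothesis readable next to the model formula and avoids building a restriction matrix by hand.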
Empirical Likelihood (Google Summer of Code 2012 project)
Empirical Likelihood-based inference for moments of univariate and multivariate variables is available, as are EL-based ANOVA tests and EL-based linear regression, including the regression through the origin model. In addition, the accelerated failure time model for inference on a linear regression model with a randomly right-censored endogenous variable is available.
Analysis of Variance (ANOVA) Modeling
Support for ANOVA is now available including type I, II, and III sums of squares. See ANOVA.
Multivariate Kernel Density Estimators (GSoC 2012 project)
Kernel density estimation has been extended to handle multivariate estimation via product kernels. It is available as sm.nonparametric.KDEMultivariate. It supports least squares and maximum likelihood cross-validation for bandwidth estimation, as well as mixed continuous, ordered, and unordered categorical data. Conditional density estimation is also available via sm.nonparametric.KDEMultivariateConditional.
Nonparametric Regression (GSoC 2012 project)
Kernel regression models are now available via sm.nonparametric.KernelReg. It is based on the product kernel mentioned above, so it has the same set of features, including support for cross-validation and for estimation with mixed continuous and categorical variables. Censored kernel regression is also provided by kernel_regression.KernelCensoredReg.
Quantile Regression Model
Quantile regression is supported via the sm.QuantReg class. Kernel and bandwidth selection options are available for estimating the asymptotic covariance matrix using a kernel density estimator.
Negative Binomial Regression Model
It is now possible to fit negative binomial models for count data via maximum likelihood using the sm.NegativeBinomial class. NB1, NB2, and geometric variance specifications are available.
l1-penalized Discrete Choice Models
A new optimization method has been added to the discrete models, which includes Logit, Probit, MNLogit and Poisson, that makes it possible to estimate the models with an l1, linear, penalization. This shrinks parameters towards zero and can set parameters that are not very different from zero to zero. This is especially useful if there are a large number of explanatory variables and a large associated number of parameters. CVXOPT is now an optional dependency that can be used for fitting these models.
New and Improved Graphics
ProbPlot: A new ProbPlot object has been added to provide a simple interface to create P-P, Q-Q, and probability plots with options to fit a distribution and show various reference lines. In the case of Q-Q and P-P plots, two different samples can be compared with the other keyword argument. See sm.graphics.ProbPlot.
import numpy as np
import statsmodels.api as sm
x = np.random.normal(loc=1.12, scale=0.25, size=37)
y = np.random.normal(loc=0.75, scale=0.45, size=37)
ppx = sm.ProbPlot(x)
ppy = sm.ProbPlot(y)
fig1 = ppx.qqplot()
fig2 = ppx.qqplot(other=ppy)
Mosaic Plot: Create a mosaic plot from a contingency table. This allows you to visualize multivariate categorical data in a rigorous and informative way. Available with sm.graphics.mosaic.
Interaction Plot: Interaction plots now handle categorical factors, along with other improvements. See sm.graphics.interaction_plot.
Regression Plots: The regression plots have been refactored and improved. They can now handle pandas objects and regression results instances appropriately. See sm.graphics.plot_fit, sm.graphics.plot_regress_exog, sm.graphics.plot_partregress, sm.graphics.plot_ccpr, sm.graphics.abline_plot, sm.graphics.influence_plot, and sm.graphics.plot_leverage_resid2.
Power and Sample Size Calculations
The power module (statsmodels.stats.power) currently implements power and sample size calculations for the t-tests (sm.stats.TTestPower, sm.stats.TTestIndPower), the normal-based test (sm.stats.NormalIndPower), F-tests (sm.stats.FTestPower, sm.stats.FTestAnovaPower) and the chi-square goodness-of-fit test (sm.stats.GofChisquarePower). The implementation is class based, but the module also provides three shortcut functions, sm.stats.tt_solve_power, sm.stats.tt_ind_solve_power and sm.stats.zt_ind_solve_power, to solve for any one of the parameters of the power equations. See this blog post for a more in-depth description of the additions.
Other important new features
IPython notebook examples: Many of our examples have been converted to, or added as, IPython notebooks. They are available in the documentation.
Improved marginal effects for discrete choice models: Expanded options for obtaining marginal effects after the estimation of nonlinear discrete choice models are available. See get_margeff.
OLS influence outlier measures: After the estimation of a model with OLS, the common set of influence and outlier measures and an outlier test are now available as the methods get_influence and outlier_test attached to the Results instance. See OLSInfluence and outlier_test.
New datasets: New datasets are available for examples.
Access to R datasets: We now have access to many of the same datasets available to R users through the Rdatasets project. You can access these using the sm.datasets.get_rdataset function. This function also includes caching of these datasets.
Improved numerical differentiation tools: Numerical differentiation routines have been greatly improved and expanded to cover all the routines discussed in:
Ridout, M.S. (2009) Statistical applications of the complex-step method of numerical differentiation. The American Statistician, 63, 66-74
See the sm.tools.numdiff module.
Consistent constant handling across models: Result statistics no longer rely on the assumption that a constant is present in the model.
Missing value handling across models: Users can now control what models do in the presence of missing values via the missing keyword available in the instantiation of every model. The options are 'none', 'drop', and 'raise'. The default is 'none', which does no missing value checks. To drop missing values, use 'drop'. 'raise' will raise an error in the presence of any missing data.
Ability to write Stata datasets: Added the ability to write Stata .dta files. See sm.iolib.StataWriter.
ARIMA modeling: statsmodels now has support for fitting Autoregressive Integrated Moving Average (ARIMA) models. See ARIMA and ARIMAResults for more information.
Support for dynamic prediction in AR(I)MA models: It is now possible to obtain dynamic in-sample forecast values in ARMA and ARIMA models.
Improved Pandas integration: statsmodels now supports all frequencies available in pandas for time-series modeling. These are used for intelligent date handling for prediction. These features are available if you pass a pandas Series or DataFrame with a DatetimeIndex to a time-series model.
New statistical hypothesis tests: Added statistics for calculating interrater agreement, including Cohen’s kappa and Fleiss’ kappa (see Interrater Reliability and Agreement), and statistics and hypothesis tests for proportions (see proportion stats). Tukey HSD (with plot) was added as an enhancement to the multiple comparison tests (sm.stats.multicomp.MultiComparison, sm.stats.multicomp.pairwise_tukeyhsd). Weighted statistics and t tests were enhanced with new options. Tests of equivalence for one sample and two independent or paired samples were added based on t tests and z tests (see Basic Statistics and t-Tests with frequency weights).
Major Bugs fixed
Post-estimation statistics for weighted least squares that depended on the centered total sum of squares were not correct. These are now correct and tested. See Issue #501.
Regression through the origin models now correctly use uncentered total sum of squares in post-estimation statistics. This affected the \(R^2\) value in linear models without a constant. See Issue #27.
Backwards incompatible changes and deprecations
Cython code is now non-optional. You will need a C compiler to build from source. If building from github and not a source release, you will also need Cython installed. See the installation documentation.
The q_matrix keyword to t_test and f_test for linear models is deprecated. You can now specify linear hypotheses using formulas.
The conf_int keyword to sm.tsa.acf is deprecated.
The names argument is deprecated in sm.tsa.VAR and sm.tsa.SVAR. This is now automatically detected and handled.
The order keyword to sm.tsa.ARMA.fit is deprecated. It is now passed in during model instantiation.
The empirical distribution function (sm.distributions.ECDF) and supporting functions have been moved to statsmodels.distributions. Their old paths have been deprecated.
The margeff method of the discrete choice models has been deprecated. Use get_margeff instead. See above. Also, the vague resid attribute of the discrete choice models has been deprecated in favor of the more descriptive resid_dev to indicate that they are deviance residuals.
The class KDE has been deprecated and renamed to KDEUnivariate to distinguish it from the new KDEMultivariate. See above.
Development summary and credits
The previous version (statsmodels 0.4.3) was released on July 2, 2012. Since then we have closed a total of 380 issues, 172 pull requests and 208 regular issues. The detailed list can be viewed below.
This release is the result of the work of the following 38 authors, who contributed a total of 2032 commits. If for any reason we have failed to list your name below, please contact us:
Ana Martinez Pardo <anamartinezpardo-at-gmail.com>
anov <novikova.go.zoom-at-gmail.com>
avishaylivne <avishay.livne-at-gmail.com>
Bruno Rodrigues <rodrigues.bruno-at-aquitania.org>
Carl Vogel <carljv-at-gmail.com>
Chad Fulton <chad-at-chadfulton.com>
Christian Prinoth <christian-at-prinoth.name>
Daniel B. Smith <neuromathdan-at-gmail.com>
dengemann <denis.engemann-at-gmail.com>
Dieter Vandenbussche <dvandenbussche-at-axioma.com>
Dougal Sutherland <dougal-at-gmail.com>
Enrico Giampieri <enrico.giampieri-at-unibo.it>
evelynmitchell <efm-github-at-linsomniac.com>
George Panterov <econgpanterov-at-gmail.com>
Grayson <graysonbadgley-at-gmail.com>
Jan Schulz <jasc-at-gmx.net>
Josef Perktold <josef.pktd-at-gmail.com>
Jeff Reback <jeff-at-reback.net>
Justin Grana <jg3705a-at-student.american.edu>
langmore <ianlangmore-at-gmail.com>
Matthew Brett <matthew.brett-at-gmail.com>
Nathaniel J. Smith <njs-at-pobox.com>
otterb <itoi-at-live.com>
padarn <padarn-at-wilsonp.anu.edu.au>
Paul Hobson <pmhobson-at-gmail.com>
Pietro Battiston <me-at-pietrobattiston.it>
Ralf Gommers <ralf.gommers-at-googlemail.com>
Richard T. Guy <richardtguy84-at-gmail.com>
Robert Cimrman <cimrman3-at-ntc.zcu.cz>
Skipper Seabold <jsseabold-at-gmail.com>
Thomas Haslwanter <thomas.haslwanter-at-fh-linz.at>
timmie <timmichelsen-at-gmx-topmail.de>
Tom Augspurger <thomas-augspurger-at-uiowa.edu>
Trent Hauck <trent.hauck-at-gmail.com>
tylerhartley <tyleha-at-gmail.com>
Vincent Arel-Bundock <varel-at-umich.edu>
VirgileFritsch <virgile.fritsch-at-gmail.com>
Zhenya <evgeni-at-burovski.me>
Note
Obtained by running git log v0.4.3..HEAD --format='* %aN <%aE>' | sed 's/@/\-at\-/' | sed 's/<>//' | sort -u.
Issues closed in the 0.5.0 development cycle
Issues closed in 0.5.0
GitHub stats for release 0.5.0 (07/02/2012 - 08/14/2013).
We closed a total of 380 issues, 172 pull requests and 208 regular issues. This is the full list (generated with the script tools/github_stats.py):
This list is automatically generated, and may be incomplete:
Pull Requests (172):
PR #1015: DOC: Bump version. Remove done tasks.
PR #1010: DOC/RLS: Update release notes workflow. Help Needed!
PR #1014: DOC: nbgenerate does not like the comment at end of line.
PR #1012: DOC: Add link to notebook and crosslink ref. Closes #924.
PR #997: misc, tests, diagnostic
PR #1009: MAINT: Add .mailmap file.
PR #817: Add 3 new unit tests for arima_process
PR #1001: BUG include_package_data for install closes #907
PR #1005: GITHUB: Contributing guidelines
PR #1007: Cleanup docs for release
PR #1003: BUG: Workaround for bug in sphinx 1.1.3. See #1002.
PR #1004: DOC: Update maintainer notes with branching instructions.
PR #1000: BUG: Support pandas 0.8.0.
PR #996: BUG: Handle combo of pandas 0.8.0 and dateutils 1.5.0
PR #995: ENH: Print dateutil version.
PR #994: ENH: Fail gracefully for version not found.
PR #993: More conservative error catching in TimeSeriesModel
PR #992: Misc fixes 12: adjustments to unit test
PR #985: MAINT: Print versions script.
PR #986: ENH: Prefer to_offset to get_offset. Closes #964.
PR #984: COMPAT: Pandas 0.8.1 compatibility. Closes #983.
PR #982: Misc fixes 11
PR #978: TST: generic mle pareto disable bsejac tests with estimated loc
PR #977: BUG python 3.3 fix for numpy str TypeError, see #633
PR #975: Misc fixes 10 numdiff
PR #970: BUG: array too long, raises exception with newer numpy closes #967
PR #965: Vincent summary2 rebased
PR #933: Update and improve GenericlikelihoodModel and miscmodels
PR #950: BUG/REF mcnemar fix exact pvalue, allow table as input
PR #951: Pylint emplike formula genmod
PR #956: Fix a docstring in KDEMultivariateConditional.
PR #949: BUG fix lowess sort when nans closes #946
PR #932: ENH: support basinhopping solver in LikelihoodModel.fit()
PR #927: DOC: clearer minimal example
PR #919: OLS summary crash
PR #918: Fixes10 emplike lowess
PR #909: Bugs in GLM pvalues, more tests, pylint
PR #906: ENH: No fmax with Windows SDK so define inline.
PR #905: MAINT more fixes
PR #898: Misc fixes 7
PR #896: Quantreg rebase2
PR #895: Fixes issue #832
PR #893: ENH: Remove unneeded restriction on low. Closes #867.
PR #894: MAINT: Remove broken function. Keep deprecation. Closes #781.
PR #856: Carljv improved lowess rebased2
PR #884: Pyflakes cleanup
PR #887: BUG: Fix kde caching
PR #883: Fixed pyflakes issue in discrete module
PR #882: Update predstd.py
PR #871: Update of sandbox doc
PR #631: WIP: Correlation positive semi definite
PR #857: BLD: apt get dependencies from Neurodebian, whitespace cleanup
PR #855: AnaMP issue 783 mixture rvs tests rebased
PR #854: Enrico multinear rebased
PR #849: Tyler tukeyhsd rebased
PR #848: BLD TravisCI use python-dateutil package
PR #784: Misc07 cleanup multipletesting and proportions
PR #841: ENH: Add load function to main API. Closes #840.
PR #820: Ensure that tuples are not considered as data, not as data containers
PR #822: DOC: Update for Cython changes.
PR #765: Fix build issues
PR #800: Automatically generate output from notebooks
PR #802: BUG: Use two- not one-sided t-test in t_test. Closes #740.
PR #806: ENH: Import formula.api in statsmodels.api namespace.
PR #803: ENH: Fix arima error message for bad start_params
PR #801: DOC: Fix ANOVA section titles
PR #795: Negative Binomial Rebased
PR #787: Origintests
PR #794: ENH: Allow pandas-in/pandas-out in tsa.filters
PR #791: Github stats for release notes
PR #779: added np.asarray call to durbin_watson in stattools
PR #772: Anova docs
PR #776: BUG: Fix dates_from_range with length. Closes #775.
PR #774: BUG: Attach prediction start date in AR. Closes #773.
PR #767: MAINT: Remove use of deprecated from examples and docs.
PR #762: ENH: Add new residuals to wrapper
PR #754: Fix arima predict
PR #760: ENH: Adjust for k_trend in information criteria. Closes #324.
PR #761: ENH: Fixes and tests sign_test. Closes #642.
PR #759: Fix 236
PR #758: DOC: Update VAR docs. Closes #537.
PR #752: Discrete cleanup
PR #750: VAR with 1d array
PR #748: Remove reference to new_t_test and new_f_test.
PR #739: DOC: Remove outdated note in docstring
PR #732: BLD: Check for patsy dependency at build time + docs
PR #731: Handle wrapped
PR #730: Fix opt fulloutput
PR #729: Get rid of warnings in docs build
PR #698: update url for hsb2 dataset
PR #727: DOC: Fix indent and add missing params to linear models. Closes #709.
PR #726: CLN: Remove unused method. Closes #694
PR #725: BUG: Should call anova_single. Closes #702.
PR #723: Rootfinding for Power
PR #722: Handle pandas.Series with names in make_lags
PR #714: Fix 712
PR #668: Allow for any pandas frequency to be used in TimeSeriesModel.
PR #711: Misc06 - bug fixes
PR #708: BUG: Fix one regressor case for conf_int. Closes #706.
PR #700: Bugs rebased
PR #680: BUG: Swap arguments in fftconvolve for scipy >= 0.12.0
PR #640: Misc fixes 05
PR #663: a typo in runs.py doc string for mcnemar test
PR #652: WIP: fixing pyflakes / pep8, trying to improve readability
PR #619: DOC: intro to formulas
PR #648: BF: Make RLM stick to Huber’s description
PR #649: Bug Fix
PR #637: Pyflakes cleanup
PR #634: VAR DOC typo
PR #623: Slowtests
PR #621: MAINT: in setup.py, only catch ImportError for pandas.
PR #590: Cleanup test output
PR #591: Interrater agreement and reliability measures
PR #618: Docs fix the main warnings and errors during sphinx build
PR #610: nonparametric examples and some fixes
PR #578: Fix 577
PR #575: MNT: Remove deprecated scikits namespace
PR #499: WIP: Handle constant
PR #567: Remove deprecated
PR #571: Dataset docs
PR #561: Grab rdatasets
PR #570: DOC: Fixed links to Rdatasets
PR #524: DOC: Clean up discrete model documentation.
PR #506: ENH: Re-use effects if model fit with QR
PR #556: WIP: L1 doc fix
PR #564: TST: Use native integer to avoid issues in dtype asserts
PR #543: Travis CI using M.Brett nipy hack
PR #558: Plot cleanup
PR #541: Replace pandas DataMatrix with DataFrame
PR #534: Stata test fixes
PR #532: Compat 323
PR #531: DOC: Add ECDF to distributions docs
PR #526: ENH: Add class to write Stata binary dta files
PR #521: DOC: Add abline plot to docs
PR #518: Small fixes: interaction_plot
PR #508: ENH: Avoid taking cholesky decomposition of diagonal matrix
PR #509: DOC: Add ARIMA to docs
PR #510: DOC: realdpi is disposable personal income. Closes #394.
PR #507: ENH: Protect numdifftools import. Closes #45
PR #504: Fix weights
PR #498: DOC: Add patys requirement to install docs
PR #491: Make _data a public attribute.
PR #494: DOC: Fix pandas links
PR #492: added intersphinx for pandas
PR #422: Handle missing data
PR #485: ENH: Improve error message for pandas objects without dates in index
PR #428: Remove other data
PR #483: Arima predict bug
PR #482: TST: Do array-array comparison when using numpy.testing
PR #471: Formula rename df -> data
PR #473: Vincent docs tweak rebased
PR #468: Docs 050
PR #462: El aft rebased
PR #461: TST: numpy 1.5.1 compatibility
PR #460: Emplike desc reg rebase
PR #410: Discrete model marginal effects
PR #417: Numdiff cleanup
PR #398: Improved plot_corr and plot_corr_grid functions.
PR #401: BUG: Finish refactoring margeff for dummy. Closes #399.
PR #400: MAINT: remove lowess.py, which was kept in 0.4.x for backwards compatibi…
PR #371: BF+TEST: fixes, checks and tests for isestimable
PR #351: ENH: Copy diagonal before write for upcoming numpy changes
PR #384: REF: Move mixture_rvs out of sandbox.
PR #368: ENH: Add polished version of acf/pacf plots with confidence intervals
PR #378: Infer freq
PR #374: ENH: Add Fair’s extramarital affair dataset. From tobit-model branch.
PR #358: ENH: Add method to OLSResults for outlier detection
PR #369: ENH: allow predict to pass through patsy for transforms
PR #352: Formula integration rebased
PR #360: REF: Deprecate order in fit and move to ARMA init
PR #366: Version fixes
PR #359: DOC: Fix sphinx warnings
Issues (208):
Issue #1036: Series no longer inherits from ndarray
Issue #1038: DataFrame with integer names not handled in ARIMA
Issue #1028: Test fail with windows and Anaconda - Low priority
Issue #676: acorr_breush_godfrey undefined nlags
Issue #922: lowess returns inconsistent with option
Issue #425: no bse in robust with norm=TrimmedMean
Issue #1025: add_constant incorrectly detects constant column
Issue #533: py3 compatibility pandas.read_csv(urlopen(...))
Issue #662: doc: install instruction: explicit about removing scikits.statsmodels
Issue #910: test failure Ubuntu TestARMLEConstant.test_dynamic_predict
Issue #80: t_model: f_test, t_test do not work
Issue #432: GenericLikelihoodModel change default for score and hessian
Issue #454: BUG/ENH: HuberScale instance is not used, allow user defined scale estimator
Issue #98: check connection or connect summary to variable names in wrappers
Issue #418: BUG: MNLogit loglikeobs, jac
Issue #1017: nosetests warnings
Issue #924: DOCS link in notebooks to notebook for download
Issue #1011: power ttest endless loop possible
Issue #907: BLD data_files for stats.libqsturng
Issue #328: consider moving example scripts into IPython notebooks
Issue #1002: Docs will not build with Sphinx 1.1.3
Issue #69: Make methods like compare_ftest work with wrappers
Issue #503: summary_old in RegressionResults
Issue #991: TST precision of normal_power
Issue #945: Installing statsmodels from github?
Issue #964: Prefer to_offset not get_offset in tsa stuff
Issue #983: bug: pandas 0.8.1 incompatibility
Issue #899: build_ext inplace does not cythonize
Issue #923: location of initialization code
Issue #980: auto lag selection in S_hac_simple
Issue #968: genericMLE Ubuntu test failure
Issue #633: python 3.3 compatibility
Issue #728: test failure for solve_power with fsolve
Issue #971: numdiff test cases
Issue #976: VAR Model does not work in 1D
Issue #972: numdiff: epsilon has no minimum value
Issue #967: lowes test failure Ubuntu
Issue #948: nonparametric tests: mcnemar, cochranq unit test
Issue #963: BUG in runstest_2sample
Issue #946: Issue with lowess() smoother in statsmodels
Issue #868: k_vars > nobs
Issue #917: emplike emplikeAFT stray dimensions
Issue #264: version comparisons need to be made more robust (may be just use LooseVersion)
Issue #674: failure in test_foreign, pandas testing
Issue #828: GLMResults inconsistent distribution in pvalues
Issue #908: RLM missing test for tvalues, pvalues
Issue #463: formulas missing in docs
Issue #256: discrete Nbin has zero test coverage
Issue #831: test errors running bdist
Issue #733: Docs: interrater cohens_kappa is missing
Issue #897: lowess failure - sometimes
Issue #902: test failure tsa.filters precision too high
Issue #901: test failure stata_writer_pandas, newer versions of pandas
Issue #900: ARIMA.__new__ errors on python 3.3
Issue #832: notebook errors
Issue #867: Baxter King has unneeded limit on value for low?
Issue #781: discreteResults margeff method not tests, obsolete
Issue #870: discrete unit tests duplicates
Issue #630: problems in regression plots
Issue #885: Caching behavior for KDEUnivariate icdf
Issue #869: sm.tsa.ARMA(…, order=(p,q)) gives “__init__() got an unexpected keyword argument ‘order’” error
Issue #783: statsmodels.distributions.mixture_rvs.py no unit tests
Issue #824: Multicomparison w/Pandas Series
Issue #789: presentation of multiple comparison results
Issue #764: BUG: multipletests incorrect reject for Holm-Sidak
Issue #766: multipletests - status and tests of 2step FDR procedures
Issue #763: Bug: multipletests raises exception with empty array
Issue #840: sm.load should be in the main API namespace
Issue #830: invalid version number
Issue #821: Fail gracefully when extensions are not built
Issue #204: Cython extensions built twice?
Issue #689: tutorial notebooks
Issue #740: why does t_test return one-sided p-value
Issue #804: What goes in statsmodels.formula.api?
Issue #675: Improve error message for ARMA SVD convergence failure.
Issue #15: arma singular matrix
Issue #559: Add Rdatasets to optional dependencies list
Issue #796: Prediction Standard Errors
Issue #793: filters are not pandas aware
Issue #785: Negative R-squared
Issue #777: OLS residuals returned as Pandas series when endog and exog are Pandas series
Issue #770: Add ANOVA to docs
Issue #775: Bug in dates_from_range
Issue #773: AR model pvalues error with Pandas
Issue #768: multipletests: numerical problems at threshold
Issue #355: add draw if interactive to plotting functions
Issue #625: Exog is not correctly handled in ARIMA predict
Issue #626: ARIMA summary does not print exogenous variable coefficients
Issue #657: order (0,1) breaks ARMA forecast
Issue #736: ARIMA predict problem for ARMA model
Issue #324: ic in ARResults, aic, bic, hqic, fpe inconsistent definition?
Issue #642: sign_test check
Issue #236: AR start_params broken
Issue #235: tests hang on Windows
Issue #156: matplotlib deprecated legend ? var plots
Issue #331: Remove stale tests
Issue #592: test failures in datetools
Issue #537: Var Models
Issue #755: Unable to access AR fit parameters when model is estimated with pandas.DataFrame
Issue #670: discrete: numerically useless clipping
Issue #515: MNLogit residuals raise a TypeError
Issue #225: discrete models only define deviance residuals
Issue #594: remove skiptest in TestProbitCG
Issue #681: Dimension Error in discrete_model.py When Running test_dummy_*
Issue #744: DOC: new_f_test
Issue #549: Ship released patsy source in statsmodels
Issue #588: patsy is a hard dependency?
Issue #716: Tests missing for functions if pandas is used
Issue #715: statsmodels regression plots not working with pandas datatypes
Issue #450: BUG: full_output in optimizers Likelihood model
Issue #709: DOCstrings linear models do not have missing params
Issue #370: BUG weightstats has wrong cov
Issue #694: DiscreteMargins duplicate method
Issue #702: bug, pylint stats.anova
Issue #423: Handling of constant across models
Issue #456: BUG: ARMA date handling incompatibility with recent pandas
Issue #514: NaNs in Multinomial
Issue #405: Check for existing old version of scikits.statsmodels?
Issue #586: Segmentation fault with OLS
Issue #721: Unable to run AR on named time series objects
Issue #125: caching pinv_wexog broke iterative fit - GLSAR
Issue #712: TSA bug with frequency inference
Issue #319: Timeseries Frequencies
Issue #707: .summary with alpha ignores parsed value
Issue #673: nonparametric: bug in _kernel_base
Issue #710: test_power failures
Issue #706: .conf_int() fails on linear regression without intercept
Issue #679: Test Baxter King band-pass filter fails with scipy 0.12 beta1
Issue #552: influence outliers breaks when regressing on constant
Issue #639: test folders not on python path
Issue #565: omni_normtest does not propagate the axis argument
Issue #563: error in doc generation for AR.fit
Issue #109: TestProbitCG failure on Ubuntu
Issue #661: from scipy import comb fails on the latest scipy 0.11.0
Issue #413: DOC: example_discrete.py missing from 0.5 documentation
Issue #644: FIX: factor plot + examples broken
Issue #645: STY: pep8 violations in many examples
Issue #173: doc sphinx warnings
Issue #601: bspline.py dependency on old scipy.stats.models
Issue #103: ecdf and step function conventions
Issue #18: Newey-West sandwich covariance is missing
Issue #279: cov_nw_panel not tests, example broken
Issue #150: precision in test_discrete.TestPoissonNewton.test_jac ?
Issue #480: rescale loglike for optimization
Issue #627: Travis-CI support for scipy
Issue #622: mark tests as slow in emplike
Issue #589: OLS F-statistic error
Issue #572: statsmodels/tools/data.py Stuck looking for la.py
Issue #580: test errors in graphics
Issue #577: PatsyData detection buglet
Issue #470: remove deprecated features
Issue #573: lazy imports are (possibly) very slow
Issue #438: New results instances are not in online documentation
Issue #542: Regression plots fail when Series objects passed to sm.OLS
Issue #239: release 0.4.x
Issue #530: l1 docs issues
Issue #539: test for statawriter (failure)
Issue #490: Travis CI on PRs
Issue #252: doc: distributions.rst refers to sandbox only
Issue #85: release 0.4
Issue #65: MLE fit of AR model has no tests
Issue #522: test does not propagate arguments to nose
Issue #517: missing array conversion or shape in linear model
Issue #523: test failure with ubuntu decimals too large
Issue #520: web site documentation, source not updated
Issue #488: Avoid cholesky decomposition of diagonal matrices in linear regression models
Issue #394: Definition in macrodata NOTE
Issue #45: numdifftools dependency
Issue #501: WLS/GLS post estimation results
Issue #500: WLS fails if weights is a pandas.Series
Issue #27: add hasconstant indicator for R-squared and df calculations
Issue #497: DOC: add patsy?
Issue #495: ENH: add footer SimpleTable
Issue #402: model._data -> model.data?
Issue #477: VAR NaN Bug
Issue #421: Enhancement: Handle Missing Data
Issue #489: Expose model._data as model.data
Issue #315: tsa models assume pandas object indices are dates
Issue #440: arima predict is broken for steps > q and q != 1
Issue #458: TST BUG? comparing pandas and array in tests, formula
Issue #464: from_formula signature
Issue #245: examples in docs: make nicer
Issue #466: broken example, pandas
Issue #57: Unhelpful error from bad exog matrix in model.py
Issue #271: ARMA.geterrors requires model to be fit
Issue #350: Writing to array returned np.diag
Issue #354: example_rst does not copy unchanged files over
Issue #467: Install issues with Pandas
Issue #444: ARMA example on stable release website not working
Issue #377: marginal effects count and discrete adjustments
Issue #426: “svd” method not supported for OLS.fit()
Issue #409: Move numdiff out of the sandbox
Issue #416: Switch to complex-step Hessian for AR(I)MA
Issue #415: bug in kalman_loglike_complex
Issue #397: plot_corr axis text labeling not working (with fix)
Issue #399: discrete errors due to incorrect in-place operation
Issue #389: VAR test_normality is broken with KeyError
Issue #388: Add tsaplots to graphics.api as graphics.tsa
Issue #387: predict date was not getting set with start = None
Issue #386: p-values not returned from acf
Issue #385: Allow AR.select_order to work without model being fit
Issue #383: Move mixture_rvs out of sandbox.
Issue #248: ARMA breaks with a 1d exog
Issue #273: When to give order for AR/AR(I)MA
Issue #363: examples folder -> tutorials folder
Issue #346: docs in sitepackages
Issue #353: PACF docs raise a sphinx warning
Issue #348: python 3.2.3 test failure zip_longest