Releases: matthewwardrop/formulaic
v0.3.0
This is a relatively large release bringing many new features and bugfixes. All users are encouraged to upgrade. The full changelog can be viewed on the documentation website. The highlights include:
Breaking changes:
- The minimum supported version of Python is now 3.7 (up from 3.6).
- Moved transform implementations from `formulaic.materializers.transforms` to the top-level `formulaic.transforms` module, and ported all existing transforms to output `FactorValues` types rather than dictionaries. `FactorValues` is an object proxy that allows output types like `pandas.DataFrame`s to be used as they normally would, with some additional metadata for formulaic accessible via the `__formulaic_metadata__` attribute. This makes non-formula direct usage of these transforms much more pleasant.
- `~` is no longer a generic formula separator, and can only be used once in a formula. Please use the newly added `|` operator to separate a formula into multiple parts.
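To illustrate the object-proxy idea behind `FactorValues`, here is a minimal sketch in plain Python. The class name `MetadataProxy` and its constructor are hypothetical and are not Formulaic's implementation; the only detail taken from the notes above is that metadata is exposed via a `__formulaic_metadata__` attribute while the wrapped object is otherwise used as normal.

```python
class MetadataProxy:
    """Hypothetical stand-in for an object proxy (not formulaic's FactorValues)."""

    def __init__(self, wrapped, **metadata):
        # Write straight into the instance so attribute forwarding is bypassed.
        object.__setattr__(self, "_wrapped", wrapped)
        object.__setattr__(self, "__formulaic_metadata__", metadata)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so the metadata attribute
        # stays reachable and everything else goes to the wrapped object.
        if name == "_wrapped":
            raise AttributeError(name)  # guard against recursion before __init__
        return getattr(self._wrapped, name)

    # Special methods are looked up on the type, so forward them explicitly.
    def __len__(self):
        return len(self._wrapped)

    def __iter__(self):
        return iter(self._wrapped)

    def __getitem__(self, key):
        return self._wrapped[key]


values = MetadataProxy([1.0, 2.0, 3.0], kind="numerical")
values.append(4.0)   # forwarded to the wrapped list
total = sum(values)  # iteration is forwarded as well
```

The wrapped list behaves as it normally would, and the extra metadata travels with it without changing the container's own interface.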
New features and enhancements:
- Added support for "structured" formulas, and updated the `~` operator to use them. Structured formulas can have named substructures, for example: `lhs` and `rhs` for the `~` operator. The representation of formulas has been updated to show this structure.
- Added support for context-sensitivity during the resolution of operators, allowing more flexible operators to be implemented (this is exploited by the `|` operator which splits formulas into multiple parts).
- The `formulaic.model_matrix` syntactic sugar function now accepts `ModelSpec` and `ModelMatrix` instances as the "formula" spec, making generation of matrices with the same form as previously generated matrices more convenient.
- Added the `poly` transform (compatible with R and patsy).
- `numpy` is now always available in formulas via `np`, allowing formulas like `np.sum(x)`. For convenience, `log`, `log10`, `log2`, `exp`, `exp10` and `exp2` are now exposed as transforms independent of user context.
- Pickleability is now guaranteed and tested via unit tests. Failure to pickle any formulaic metadata object (such as formulas, model specs, etc) is considered a bug.
- The capturing of user context for use in formula materialization has been split out into a utility method `formulaic.utils.context.capture_context()`. This can be used by libraries that wrap Formulaic to capture the variables and/or transforms available in a user's environment where appropriate.
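For readers wondering what "capturing user context" involves, the following is a rough sketch using only the standard library. The function `capture_context_sketch`, its `depth` parameter, and the toy `fit` wrapper are all hypothetical; the real `formulaic.utils.context.capture_context()` may have a different signature and behaviour.

```python
import inspect


def capture_context_sketch(depth=0):
    """Snapshot the variables visible `depth` frames above the caller (assumed semantics)."""
    frame = inspect.currentframe().f_back  # frame of whoever called this function
    for _ in range(depth):
        frame = frame.f_back
    try:
        # Locals shadow globals, mirroring normal name resolution.
        return {**frame.f_globals, **frame.f_locals}
    finally:
        del frame  # avoid keeping a reference cycle alive


def fit(expression):
    # A hypothetical wrapping library: it wants the variables of *its* caller,
    # not its own, hence depth=1.
    context = capture_context_sketch(depth=1)
    # Toy "materialization": evaluate the expression against the captured names.
    return eval(expression, {}, context)


def user_code():
    scale = 10  # only visible in the user's frame
    return fit("scale * 2")


result = user_code()
```

The point of the design is that the wrapping library, rather than the user, decides how far up the call stack to look.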
v0.2.4
This is a minor release that fixes an issue whereby the `ModelSpec` instances attached to `ModelMatrix` objects would keep a reference to the original data, greatly inflating the size of the `ModelSpec`. (Many thanks to @mmacpherson for reporting!)
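The effect of a spec object holding on to its source data can be sketched with plain dictionaries and `pickle`. The dictionaries here are hypothetical stand-ins and do not reflect the actual layout of `ModelSpec`.

```python
import pickle

# Hypothetical dataset: 10,000 rows of two floats each.
rows = [{"x": float(i), "y": float(2 * i)} for i in range(10_000)]

# A spec that (accidentally) keeps a reference to the data it was built from...
heavy_spec = {"formula": "y ~ x", "columns": ["Intercept", "x"], "data": rows}
# ...versus one that only records what is needed to rebuild the matrix.
light_spec = {"formula": "y ~ x", "columns": ["Intercept", "x"]}

heavy_size = len(pickle.dumps(heavy_spec))
light_size = len(pickle.dumps(light_spec))

# The light spec still round-trips intact.
restored = pickle.loads(pickle.dumps(light_spec))
```

Comparing `heavy_size` with `light_size` shows how a single retained reference dominates the serialized size.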
v0.2.3
This release is identical to v0.2.2, except that the source distribution now includes the docs, license, and tox configuration.
v0.2.2
This is a minor release with one bugfix.
- Fix pandas model matrix outputs when constants are generated as part of model matrix construction and the incoming dataframe has a custom rather than range index. (issue noted by @CamDavidsonPilon)
v0.2.1
This is a minor patch release that brings in some valuable improvements.
- Keep track of the pandas dataframe index if outputting a pandas `DataFrame`. (issue noted by @CamDavidsonPilon)
- Fix using functions in formulae that are nested within a module or class. (issue noted by @bashtage)
- Avoid crashing when an attempt is made to generate an empty model matrix. (PR from @bashtage)
- Enriched setup.py with long description for a better experience on PyPI. (PR from @bashtage)
v0.2.0
This is a major release that brings in a large number of improvements. The API is still not finalised (that will happen with the 1.0 release), so be careful of API breakages brought in by subsequent major releases (e.g. v0.3.x). Many thanks to those who have contributed by testing or code submissions (including @bashtage, @griffiri, and @CamDavidsonPilon).
New Features and Improvements:
- Enriched formula parser to support quoting, and evaluation of formulas involving fields with invalid Python names.
- Added commonly used stateful transformations (identity, center, scale, bs).
- Improved the helpfulness of error messages reported by the formula parser.
- Added support for basic calculus on formulas (useful when taking the gradient of linear models).
- Made it easier to extend Formulaic with additional materializers.
- Many internal improvements to code quality and reliability, including 100% test coverage.
- Added benchmarks for Formulaic against R and patsy.
- Added documentation.
- Miscellaneous other bugfixes and cleanups.
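As a rough illustration of what makes a transformation such as `center` "stateful", the sketch below remembers the mean from the first dataset it sees and reuses it afterwards. The function signature and the `state` dictionary are assumptions made for this example and are not Formulaic's API.

```python
def center(values, state):
    """Subtract the mean, storing it in `state` the first time it is computed."""
    if "mean" not in state:
        state["mean"] = sum(values) / len(values)
    return [v - state["mean"] for v in values]


state = {}
# First call (e.g. on training data): the mean is computed and remembered.
train = center([1.0, 2.0, 3.0], state)
# Later call (e.g. on new data): the remembered mean is reused, not recomputed.
new = center([4.0, 6.0], state)
```

Reusing the stored mean keeps new data on the same scale as the data the model was originally built from.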