feets package
Subpackages
- feets.datasets package
- feets.extractors package
- Submodules
- feets.extractors.ext_amplitude module
- feets.extractors.ext_anderson_darling module
- feets.extractors.ext_astropy_lomb_scargle module
- feets.extractors.ext_autocor_length module
- feets.extractors.ext_bazin_fit module
- feets.extractors.ext_beyond_n_std module
- feets.extractors.ext_car module
- feets.extractors.ext_color module
- feets.extractors.ext_con module
- feets.extractors.ext_cusum module
- feets.extractors.ext_dmdt module
- feets.extractors.ext_duration module
- feets.extractors.ext_eta module
- feets.extractors.ext_eta_color module
- feets.extractors.ext_eta_e module
- feets.extractors.ext_excess_variance module
- feets.extractors.ext_fourier_components module
- feets.extractors.ext_gskew module
- feets.extractors.ext_inter_percentile_range module
- feets.extractors.ext_light_curve_lomb_scargle module
- feets.extractors.ext_linear_fit module
- feets.extractors.ext_linear_trend module
- feets.extractors.ext_linexp_fit module
- feets.extractors.ext_max_slope module
- feets.extractors.ext_max_time_interval module
- feets.extractors.ext_mean module
- feets.extractors.ext_mean_variance module
- feets.extractors.ext_median module
- feets.extractors.ext_median_abs_dev module
- feets.extractors.ext_median_amplitude module
- feets.extractors.ext_median_brp module
- feets.extractors.ext_min_time_interval module
- feets.extractors.ext_otsu_split module
- feets.extractors.ext_pair_slope_trend module
- feets.extractors.ext_percent_amplitude module
- feets.extractors.ext_percent_diff_percentile module
- feets.extractors.ext_percentage_ratio module
- feets.extractors.ext_q31 module
- feets.extractors.ext_rcs module
- feets.extractors.ext_reduced_chi2 module
- feets.extractors.ext_roms module
- feets.extractors.ext_signature module
- feets.extractors.ext_skew module
- feets.extractors.ext_slotted_a_length module
- feets.extractors.ext_small_kurtosis module
- feets.extractors.ext_std module
- feets.extractors.ext_stetson module
- feets.extractors.ext_stetson_k module
- feets.extractors.ext_structure_functions module
- feets.extractors.ext_time_mean module
- feets.extractors.ext_time_std module
- feets.extractors.ext_villar_fit module
- feets.extractors.ext_weighted_beyond_n_std module
- feets.extractors.ext_weighted_mean module
- feets.extractors.extractor module
  - Extractor
    - Extractor.extract()
    - Extractor.flatten_feature()
    - Extractor.get_data()
    - Extractor.get_default_params()
    - Extractor.get_dependencies()
    - Extractor.get_features()
    - Extractor.get_optional_data()
    - Extractor.get_required_data()
    - Extractor.params
    - Extractor.prepare_extract()
    - Extractor.to_dict()
    - Extractor.validate_extract()
    - Extractor.validate_flatten()
  - ExtractorBadDefinedError
  - ExtractorValidationError
  - ExtractorWarning
  - FeatureExtractionWarning
  - extractor_warning()
  - feature_warning()
- feets.extractors.light_curve_extractor module
- feets.extractors.registry module
  - EntityNotFoundError
  - ExtractorRegistry
    - ExtractorRegistry.extractor_of()
    - ExtractorRegistry.extractors_from_data()
    - ExtractorRegistry.extractors_from_features()
    - ExtractorRegistry.get_execution_plan()
    - ExtractorRegistry.is_extractor_registered()
    - ExtractorRegistry.is_feature_registered()
    - ExtractorRegistry.register_extractor()
    - ExtractorRegistry.registered_extractors
    - ExtractorRegistry.registered_features
    - ExtractorRegistry.sort_extractors_by_dependencies()
    - ExtractorRegistry.unregister_extractor()
    - ExtractorRegistry.validate_is_extractor()
  - RegistryConflictError
  - RegistryError
  - RegistryValidationError
- Module contents
  - Extractor
    - Extractor.extract()
    - Extractor.flatten_feature()
    - Extractor.get_data()
    - Extractor.get_default_params()
    - Extractor.get_dependencies()
    - Extractor.get_features()
    - Extractor.get_optional_data()
    - Extractor.get_required_data()
    - Extractor.params
    - Extractor.prepare_extract()
    - Extractor.to_dict()
    - Extractor.validate_extract()
    - Extractor.validate_flatten()
  - ExtractorBadDefinedError
  - ExtractorValidationError
  - ExtractorWarning
- feets.libs package
Submodules
feets.core module
Core functionalities of feets.
- class feets.core.FeatureSpace(data=None, only=None, exclude=None, dask_options=None, **kwargs)
Bases: object
Class to select and extract features from a time series.
The FeatureSpace class allows for the extraction of selected features from the available data vectors (e.g., magnitude, time, error, second magnitude) of one or more time series.
The data, only, and exclude filters can be combined to control the selection of features to be extracted. If no filter is provided, the selection will include all the available features.
- Parameters:
- data : array_like, optional
List of available data vectors to extract from. If provided, only the features that can be computed on some of the selected vectors will be included.
- only : array_like, optional
List of features to be extracted. If provided, only the selected features will be included. It must be disjoint with exclude.
- exclude : array_like, optional
List of features to be excluded from the extraction. If provided, all features except the selected ones will be included. It must be disjoint with only.
- **kwargs
Additional parameters used to initialize the extractors.
- Attributes:
- features : frozenset
The features selected for extraction, based on the provided filters.
- extractors : np.ndarray
The extractor instances used to compute the features.
- required_data : frozenset
The data vectors required for the extraction.
- dask_options : dict
Options to be passed to the Dask scheduler.
- Raises:
- ValueError
If an invalid combination of data, only, and exclude is provided.
See also
feets.Features: Class to manage and manipulate feature extraction results.
feets.Extractor: Abstract base class for feature extractors.
dask.compute: Compute several dask collections at once.
Examples
Using data filter to specify the available data vectors:
>>> fs = FeatureSpace(data=['magnitude', 'time'])
>>> # The resulting `FeatureSpace` will only extract the features that
>>> # depend on 'magnitude' and/or 'time'.
>>> fs.extract(**lc)
<Features feature_names={'Mean', 'Std', 'PeriodLS', 'Signature', ...}, length=1>
Using only filter to select specific features for extraction:
>>> fs = FeatureSpace(only=['Mean', 'Std'])
>>> # The resulting `FeatureSpace` will only extract the 'Mean' and 'Std'
>>> # features, regardless of the available data vectors.
>>> fs.extract(**lc)
<Features feature_names={'Mean', 'Std'}, length=1>
Using exclude filter to exclude specific features from extraction:
>>> fs = FeatureSpace(exclude=['Mean', 'Std'])
>>> # The resulting `FeatureSpace` will extract all features except for
>>> # 'Mean' and 'Std', regardless of the available data vectors.
>>> fs.extract(**lc)
<Features feature_names={'PeriodLS', 'Signature', ...}, length=1>
Configuring the extractors with additional parameters:
>>> fs = FeatureSpace(
...     data=['magnitude', 'time'],
...     PeriodLS={'nperiods': 5},
...     Signature={'phase_bins': 20, 'mag_bins': 15}
... )
>>> # The resulting `FeatureSpace` will extract features that depend on
>>> # 'magnitude' and 'time', with the specified parameters for the
>>> # `PeriodLS` and `Signature` extractors.
>>> fs.extract(**lc)
<Features feature_names={'Mean', 'Std', 'PeriodLS', 'Signature', ...}, length=1>
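The way the data, only, and exclude filters combine can be illustrated with a small standalone sketch. It does not use feets: the FEATURE_REQUIREMENTS table, the feature names in it, and the select_features function are hypothetical, and the rule that a feature qualifies when all of its required vectors are listed in data is an assumption drawn from the examples above.

```python
# Hypothetical feature -> required data vectors table (not the feets registry).
FEATURE_REQUIREMENTS = {
    "Mean": {"magnitude"},
    "Std": {"magnitude"},
    "MaxSlope": {"magnitude", "time"},
    "Color": {"magnitude", "magnitude2"},
}


def select_features(data=None, only=None, exclude=None):
    """Return the feature names that survive the three filters."""
    only = set(only or ())
    exclude = set(exclude or ())
    overlap = only & exclude
    if overlap:
        # Assumed behaviour: 'only' and 'exclude' must be disjoint.
        raise ValueError(f"'only' and 'exclude' overlap: {sorted(overlap)}")

    selected = set(FEATURE_REQUIREMENTS)
    if data is not None:
        available = set(data)
        # Keep features whose required vectors are all available.
        selected = {
            name for name in selected
            if FEATURE_REQUIREMENTS[name] <= available
        }
    if only:
        selected &= only
    selected -= exclude
    return frozenset(selected)


print(sorted(select_features(data=["magnitude", "time"])))
print(sorted(select_features(exclude=["Mean", "Std"])))
```

With no filter at all, the sketch returns every name in the table, mirroring the "all available features" default described above.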
- extract(**lc)
Extract the selected features from the provided light curve.
- Parameters:
- **lc : dict
A light curve represented as a dictionary, mapping data vector names to their values.
- Returns:
- Features
A collection of extracted features of the provided light curve.
See also
feets.Features: Class to manage and manipulate feature extraction results.
extract_many
Examples
>>> fs = FeatureSpace(only=['Mean'])
>>> fs.extract(magnitude=[1, 2, 3])
Features(feature_names={'Mean'}, length=1)
- extract_many(*lcs)
Extract the selected features from the provided light curves.
- Parameters:
- *lcs : list of dict
A list of light curves, where each light curve is a dictionary mapping data vector names to their values.
- Returns:
- Features
A collection of extracted features of the provided light curves.
See also
feets.Features: Class to manage and manipulate feature extraction results.
extract
Examples
>>> fs = FeatureSpace(only=['Mean'])
>>> fs.extract_many({'magnitude': [1, 2, 3]}, {'magnitude': [4, 5, 6]})
Features(feature_names={'Mean'}, length=2)
- property extractors
np.ndarray: The extractor instances used to compute the features.
The extractors are ordered according to their dependencies, meaning that the extractors that depend on others come after those they depend on.
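A dependency-respecting order of this kind is a topological sort. The sketch below shows the idea with the standard-library graphlib module; it is not the feets implementation, and the extractor names and the dependency table are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: extractor name -> names of extractors it depends on.
DEPENDENCIES = {
    "Base": set(),
    "Other": set(),
    "NeedsBase": {"Base"},
    "NeedsBoth": {"NeedsBase", "Other"},
}


def sort_by_dependencies(dependencies):
    """Return the names ordered so that dependencies come first."""
    # TopologicalSorter expects node -> predecessors, which matches
    # "extractor -> extractors it depends on".
    return list(TopologicalSorter(dependencies).static_order())


order = sort_by_dependencies(DEPENDENCIES)
print(order)
```

Several valid orders usually exist; the only guarantee is that each name appears after everything it depends on. A cyclic dependency makes static_order() raise graphlib.CycleError.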
- classmethod from_dict(data)
Create a FeatureSpace object from a dictionary representation.
- Parameters:
- data : dict
A dictionary representation of the FeatureSpace, including the data vectors required for extraction, the selected features, the list of extractors with their parameters, and the Dask options.
- Returns:
- FeatureSpace
A FeatureSpace object configured with the features, required data vectors, dask options, and extractors from the provided dictionary.
- classmethod from_lightcurve(**lc)
Create a FeatureSpace for the provided light curve.
The resulting FeatureSpace will be configured to extract only the features that can be computed from the data vectors present in the provided light curve.
- Parameters:
- **lc : dict
A light curve represented as a dictionary, mapping data vector names to their values.
- Returns:
- FeatureSpace
A FeatureSpace instance configured for the provided light curve.
Examples
>>> lc = {'magnitude': [1, 2, 3]}
>>> fs = FeatureSpace.from_lightcurve(**lc)
>>>
>>> # The resulting `FeatureSpace` will only extract features that
>>> # depend on 'magnitude'.
>>> fs.extract(**lc)
Features(feature_names={'Mean', 'Std', ...}, length=1)
- classmethod from_lightcurves(*lcs)
Create a FeatureSpace for the provided light curves.
This method determines the common data vectors (e.g., 'magnitude', 'time') present across all provided light curves. It then creates a FeatureSpace configured to extract only the features that can be computed from this common set of data vectors.
- Parameters:
- *lcs : list of dict
A list of light curves, where each light curve is a dictionary mapping data vector names to their values.
- Returns:
- FeatureSpace
A FeatureSpace instance configured for the common data vectors.
- Raises:
- ValueError
If no common data vectors are found among the light curves.
Examples
>>> lc1 = {'magnitude': [1, 2, 3]}
>>> lc2 = {'time': [0.1, 0.2, 0.3], 'magnitude': [4, 5, 6]}
>>>
>>> # The common data vector is 'magnitude'.
>>> fs = FeatureSpace.from_lightcurves(lc1, lc2)
>>>
>>> # The resulting `FeatureSpace` will only extract features that
>>> # depend on 'magnitude'.
>>> fs.extract(**lc1)
Features(feature_names={'Mean', 'Std', ...}, length=1)
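Finding the common data vectors amounts to intersecting the keys of the light-curve dictionaries. A minimal standalone sketch follows; the common_data_vectors helper is hypothetical and is not part of feets.

```python
def common_data_vectors(*lcs):
    """Return the data vector names present in every light curve."""
    if not lcs:
        raise ValueError("at least one light curve is required")
    common = set(lcs[0])
    for lc in lcs[1:]:
        common &= set(lc)  # keep only names shared with this light curve
    if not common:
        raise ValueError("no common data vectors among the light curves")
    return frozenset(common)


lc1 = {"magnitude": [1, 2, 3]}
lc2 = {"time": [0.1, 0.2, 0.3], "magnitude": [4, 5, 6]}
print(common_data_vectors(lc1, lc2))
```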
- property required_data
frozenset: The data vectors required for the extraction.
- property selected_features
frozenset: The features selected for extraction.
- to_dict()
Convert the FeatureSpace object to a dictionary representation.
- Returns:
- dict
A dictionary representation of the FeatureSpace, including the data vectors required for extraction, the selected features, the list of extractors with their parameters, and the Dask options.
- to_json(*, path_or_buffer=None, **kwargs)
Serialize the FeatureSpace to a JSON formatted string or file.
- Parameters:
- path_or_buffer : str, pathlib.Path, file-like object or None, optional
The file path or buffer to write the JSON data to. If None, the JSON data is returned as a string. Defaults to None.
- **kwargs
Additional parameters to pass to io.store_json.
- Returns:
- str
The JSON formatted string if path_or_buffer is None.
- to_yaml(*, path_or_buffer=None, **kwargs)
Serialize the FeatureSpace to a YAML formatted string or file.
- Parameters:
- path_or_buffer : str, pathlib.Path, file-like object or None, optional
The file path or buffer to write the YAML data to. If None, the YAML data is returned as a string. Defaults to None.
- **kwargs
Additional parameters to pass to io.store_yaml.
- Returns:
- str
The YAML formatted string if path_or_buffer is None.
feets.features module
Manage and manipulate feature extraction results.
- class feets.features.Features(features, extractors)
Bases: Sequence
Class to manage and manipulate feature extraction results.
The Features class encapsulates the results of feature extraction performed on multiple light curves. It provides an interface to access the extracted features either by feature name or by light curve index.
- Parameters:
- features : array_like of dict
The results of the feature extraction for each of the light curves.
- extractors : array_like of Extractor
The extractor instances used to compute the features.
- Attributes:
- features : np.ndarray
The extracted features by light curve.
- extractors : np.ndarray
The extractor instances used to compute the features.
- feature_names : frozenset
The names of the extracted features.
- length : int
The number of light curves.
Examples
>>> from feets import FeatureSpace
>>> fs = FeatureSpace(only=["Std", "Mean"])
>>> results = fs.extract_many(
...     {"magnitude": [1, 1.5, 2]},
...     {"magnitude": [1, 2, 3]}
... )
>>> results
<Features feature_names={'Std', 'Mean'}, length=2>
Accessing results by feature name:
>>> results.Mean
array([1.5, 2. ])
>>> results.Std
array([0.5, 1. ])
Accessing results by light curve index:
>>> results[0]
{'Std': np.float64(0.5), 'Mean': np.float64(1.5)}
>>> results[1]
{'Std': np.float64(1.0), 'Mean': np.float64(2.0)}
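The two access styles, by feature name and by light curve index, can be reproduced with a small Sequence subclass. This is only a sketch of the pattern: the FeatureTable class is hypothetical, it returns plain lists instead of NumPy arrays, and it makes no claim about how Features is implemented.

```python
from collections.abc import Sequence


class FeatureTable(Sequence):
    """Hypothetical stand-in: one dict of feature values per light curve."""

    def __init__(self, rows):
        self._rows = tuple(dict(row) for row in rows)

    def __getitem__(self, index):
        # Access by light curve index.
        return self._rows[index]

    def __len__(self):
        return len(self._rows)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails:
        # treat the attribute name as a feature name.
        rows = self.__dict__.get("_rows", ())
        if rows and all(name in row for row in rows):
            return [row[name] for row in rows]
        raise AttributeError(name)


table = FeatureTable([
    {"Std": 0.5, "Mean": 1.5},
    {"Std": 1.0, "Mean": 2.0},
])
print(table.Mean)  # values of 'Mean' across light curves
print(table[0])    # all features of the first light curve
```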
- as_frame(**kwargs)
Convert the extraction results into a pandas.DataFrame.
This method transforms the extracted features into a pandas.DataFrame, where each row corresponds to a light curve and each column represents a feature.
The conversion process can be parallelized to improve performance on large datasets.
- Parameters:
- **kwargs
Keyword arguments passed to the joblib.Parallel constructor, used when parallel processing the pandas.DataFrame conversion.
- Returns:
- pandas.DataFrame
A pandas.DataFrame representation of the extracted features. Each row corresponds to a light curve and each column represents a feature.
Examples
>>> from feets import FeatureSpace
>>> fs = FeatureSpace(only=["Std", "Mean"])
>>> results = fs.extract_many(
...     {"magnitude": [1, 1.5, 2]},
...     {"magnitude": [1, 2, 3]}
... )
>>> results.as_frame()
Features     Std  Mean
Light Curve
0            0.5   1.5
1            1.0   2.0
- property feature_names
frozenset: The names of the extracted features.
- property length
int: The number of light curves.
feets.io module
Serialize and deserialize feets.FeatureSpace objects.
- class feets.io.CustomJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: JSONEncoder
Custom JSON (https://json.org) encoder for feets.FeatureSpace objects.
This class extends the json.JSONEncoder to add support for the following objects and types:
Python                                        JSON
tuple, set, frozenset, np.ndarray             array
datetime                                      string
np.integer, np.floating, np.complexfloating   number
np.true                                       true
np.false                                      false
- Attributes:
- CONVERTERS : dict
A dictionary mapping data types to their corresponding converter functions.
See also
json.JSONEncoder: Extensible JSON https://json.org encoder for Python data structures.
- CONVERTERS = ((<class 'tuple'>, <class 'list'>), (<class 'set'>, <class 'list'>), (<class 'frozenset'>, <class 'list'>), (<class 'datetime.datetime'>, <method 'isoformat' of 'datetime.datetime' objects>), (<class 'numpy.integer'>, <class 'int'>), (<class 'numpy.floating'>, <class 'float'>), (<class 'numpy.complexfloating'>, <class 'complex'>), (<class 'numpy.bool'>, <class 'bool'>), (<class 'numpy.ndarray'>, <method 'tolist' of 'numpy.ndarray' objects>))
- default(obj)
Serialize an object to a JSON-serializable format.
This method overrides the default method of the json.JSONEncoder class to provide custom serialization for the data structures defined in the CONVERTERS attribute, or calls the base implementation for any other object.
- Returns:
- object
The JSON-serializable representation of the object.
- Raises:
- TypeError
If the object does not match any of the types in CONVERTERS.
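The converter-table pattern described here can be sketched with the standard library alone. The SketchEncoder class below is hypothetical and covers only set, frozenset, and datetime; the NumPy entries listed in CONVERTERS are left out to keep the example free of third-party imports.

```python
import json
from datetime import datetime


class SketchEncoder(json.JSONEncoder):
    """Hypothetical encoder driven by a (type, converter) table."""

    CONVERTERS = (
        (set, list),
        (frozenset, list),
        (datetime, datetime.isoformat),
    )

    def default(self, obj):
        # json calls default() only for objects it cannot serialize itself.
        for kind, convert in self.CONVERTERS:
            if isinstance(obj, kind):
                return convert(obj)
        # The base implementation raises TypeError for unsupported objects.
        return super().default(obj)


payload = {"when": datetime(2024, 1, 2, 3, 4, 5), "tags": frozenset({"x"})}
encoded = json.dumps(payload, cls=SketchEncoder)
print(encoded)
```

Note that json already encodes tuples as arrays on its own, so a tuple never reaches default().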
- feets.io.none_open_or_buffer(path_or_buffer, mode)
Context manager to handle file paths or buffers as file-like objects.
This function provides a unified way to handle file paths, existing file-like buffers, or no target at all (None), and yields a file-like object for reading or writing.
- Parameters:
- path_or_buffer : str, pathlib.Path, file-like object or None
If str or pathlib.Path, the file at the given path is opened with the specified mode.
If a file-like object, it is yielded directly.
If None, an io.StringIO in-memory buffer is created and yielded.
- mode : str
The mode in which to open the file (e.g., 'r', 'w'). This is ignored if path_or_buffer is not a path.
- Yields:
- file-like object
An open, ready-to-use file-like object.
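The three cases above map naturally onto a generator-based context manager. The following is a sketch of that pattern, not the feets source; the open_or_buffer name is hypothetical, and leaving caller-supplied and in-memory buffers open on exit is an assumption.

```python
import contextlib
import io
import pathlib


@contextlib.contextmanager
def open_or_buffer(path_or_buffer, mode):
    """Yield a file-like object for a path, an existing buffer, or None."""
    if path_or_buffer is None:
        # No target: hand out a fresh in-memory text buffer.
        yield io.StringIO()
    elif isinstance(path_or_buffer, (str, pathlib.Path)):
        # A path: open it and close it again when the block ends.
        with open(path_or_buffer, mode) as fp:
            yield fp
    else:
        # Already file-like: pass it through; the caller owns its lifetime.
        yield path_or_buffer


with open_or_buffer(None, "w") as fp:
    fp.write("hello")
    text = fp.getvalue()
print(text)
```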
- feets.io.read_json(path_or_buffer)
Deserialize a JSON formatted string or file to feets.FeatureSpace.
- Parameters:
- path_or_buffer : str, pathlib.Path or file-like object
The file path, buffer, or stream to read the JSON data from.
- Returns:
- feets.FeatureSpace
A feets.FeatureSpace object containing the deserialized data.
See also
feets.FeatureSpace: Class to select and extract features from a time series.
store_json
- feets.io.read_yaml(path_or_buffer)
Deserialize a YAML formatted string or file to feets.FeatureSpace.
- Parameters:
- path_or_buffer : str, pathlib.Path or file-like object
The file path, buffer, or stream to read the YAML data from.
- Returns:
- feets.FeatureSpace
A feets.FeatureSpace object containing the deserialized data.
See also
feets.FeatureSpace: Class to select and extract features from a time series.
store_yaml
- feets.io.store_json(fspace, path_or_buffer=None, **kwargs)
Serialize a feets.FeatureSpace to a JSON formatted string or file.
- Parameters:
- fspace : feets.FeatureSpace
The feets.FeatureSpace object to serialize. This object must implement a to_dict method that returns a serializable representation.
- path_or_buffer : str, pathlib.Path, file-like object or None, default=None
The file path, buffer, or stream to write the JSON data to. If None, the JSON data is returned as a string.
- **kwargs
Additional keyword arguments passed to json.dump when serializing the feature space.
- Returns:
- str or None
If path_or_buffer is None, returns a JSON formatted string representing the feature space. Otherwise, writes the JSON data to the specified file or buffer and returns None.
- Raises:
- TypeError
If the provided feature space contains non-serializable objects.
- feets.io.store_yaml(fspace, path_or_buffer=None, **kwargs)
Serialize a feets.FeatureSpace to a YAML formatted string or file.
- Parameters:
- fspace : feets.FeatureSpace
The feets.FeatureSpace object to serialize. This object must implement a to_dict method that returns a serializable representation.
- path_or_buffer : str, pathlib.Path, file-like object or None, default=None
The file path, buffer, or stream to write the YAML data to. If None, the YAML data is returned as a string.
- **kwargs
Additional keyword arguments passed to yaml.safe_dump when serializing the feature space.
- Returns:
- str or None
If path_or_buffer is None, returns a YAML formatted string representing the feature space. Otherwise, writes the YAML data to the specified file or buffer and returns None.
- Raises:
- TypeError
If the provided feature space contains non-serializable objects.
See also
feets.FeatureSpace: Class to select and extract features from a time series.
read_yaml
yaml.safe_dump
feets.preprocess module
Functions for preprocessing light curve data vectors.
- feets.preprocess.align(time, time2, magnitude, magnitude2, error=None, error2=None)
Synchronizes two light curves in different bands.
- Parameters:
- time : array-like
- time2 : array-like
- magnitude : array-like
- magnitude2 : array-like
- error : array-like, optional
- error2 : array-like, optional
- Returns:
- aligned_time : array-like
- aligned_magnitude : array-like
- aligned_magnitude2 : array-like
- aligned_error : array-like
- aligned_error2 : array-like
- feets.preprocess.remove_noise(time, magnitude, error, error_limit=3, std_limit=5)
Removes noise from the light curve data vectors.
Points farther than std_limit standard deviations from the mean, or with errors greater than error_limit times the mean error, are considered noise and are removed.
- Parameters:
- time : array-like
- magnitude : array-like
- error : array-like
- error_limit : float, default=3
- std_limit : float, default=5
- Returns:
- time_clean : array-like
- magnitude_clean : array-like
- error_clean : array-like
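A minimal sketch of a noise filter of this kind in plain Python, not the library's implementation. It assumes a point is discarded when its magnitude lies more than std_limit standard deviations from the mean magnitude or when its error exceeds error_limit times the mean error; the function name and the use of the population standard deviation are choices made for the illustration.

```python
from statistics import fmean, pstdev


def remove_noise_sketch(time, magnitude, error, error_limit=3, std_limit=5):
    """Drop points that are outliers in magnitude or have large errors."""
    mean_mag = fmean(magnitude)
    std_mag = pstdev(magnitude)          # population standard deviation
    max_error = error_limit * fmean(error)

    kept = [
        (t, m, e)
        for t, m, e in zip(time, magnitude, error)
        if e <= max_error
        and (std_mag == 0 or abs(m - mean_mag) / std_mag <= std_limit)
    ]
    if not kept:
        return [], [], []
    time_clean, magnitude_clean, error_clean = map(list, zip(*kept))
    return time_clean, magnitude_clean, error_clean


time = [0, 1, 2, 3, 4, 5]
magnitude = [10, 10, 10, 10, 10, 20]
error = [0.1, 0.1, 0.1, 0.1, 1.0, 0.1]
# A tighter std_limit than the default so the short series shows an effect.
print(remove_noise_sketch(time, magnitude, error, std_limit=2))
```

In this toy series the fifth point is dropped for its large error and the sixth for its deviation from the mean magnitude.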
feets.runner module
Run multiple feature extractors in parallel.
- exception feets.runner.DataRequiredError
Bases: ValueError
A required data vector is missing from the light curve.
- feets.runner.run(*, extractors, selected_features, required_data, lcs, dask_options=None)
Run instances of feature extractors on a collection of light curves.
Executes the specified extractor instances on each provided light curve, returning the extracted features for each. Feature extraction is performed in parallel using Dask, enabling efficient computation across multiple light curves. The order of execution respects dependencies between extractors; ensure that the extractors list is topologically sorted so that dependencies are satisfied.
- Parameters:
- extractors : array_like of feets.extractors.Extractor
Feature extractor instances to apply. Must be sorted so that any extractor appears after those it depends on.
- selected_features : array_like of str
Names of features to extract from each light curve.
- required_data : array_like of str
Names of required data fields that must be present in each light curve.
- lcs : array_like of dict
Light curves to process, each represented as a dictionary of data vectors.
- dask_options : dict, optional
Options for the Dask scheduler. Defaults to {"scheduler": "processes"}.
- Returns:
- list of dict
List of dictionaries, one per input light curve, with the extracted feature values. Each dictionary contains the extracted features specified in selected_features. The order of the list matches the input lcs.
- Raises:
- DataRequiredError
If any of the required data vectors are missing from a light curve.
See also
feets.Extractor: Abstract base class for feature extractors.
feets.FeatureSpace: Class to select and extract features from a time series.
dask.compute
Notes
Feature extraction is parallelized using Dask. You can control parallelism and scheduler behavior via the dask_options parameter.
For more information on Dask, see: https://docs.dask.org/en/stable/
Examples
>>> from feets.extractors import Mean
>>> from feets.runner import run
>>>
>>> # Instantiate the feature extractor
>>> mean_extractor = Mean()
>>>
>>> # Light curves to process
>>> lcs = [{"magnitude": [1, 2, 3]}, {"magnitude": [4, 5, 6]}]
>>>
>>> # Run the feature extraction
>>> run(
...     extractors=[mean_extractor],
...     selected_features=["Mean"],
...     required_data=["magnitude"],
...     lcs=lcs
... )
[{'Mean': np.float64(2.0)}, {'Mean': np.float64(5.0)}]
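The required-data check that leads to DataRequiredError can be pictured as a validation pass over the light curves before any extractor runs. The sketch below is standalone and hypothetical: MissingDataError and check_required_data are illustration names, and treating a None value as missing is an assumption.

```python
class MissingDataError(ValueError):
    """Hypothetical stand-in for a 'required data vector missing' error."""


def check_required_data(required_data, lcs):
    """Raise if any light curve lacks one of the required data vectors."""
    for index, lc in enumerate(lcs):
        missing = sorted(
            name for name in required_data if lc.get(name) is None
        )
        if missing:
            raise MissingDataError(
                f"light curve {index} is missing data vectors: {missing}"
            )


lcs = [{"magnitude": [1, 2, 3]}, {"magnitude": [4, 5, 6]}]
check_required_data(["magnitude"], lcs)  # passes silently

try:
    check_required_data(["magnitude", "time"], lcs)
except MissingDataError as exc:
    print(exc)
```

Subclassing ValueError, as the documented exception does, lets callers catch the problem with a plain `except ValueError` as well.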
Module contents
feets: feATURE eXTRACTOR FOR tIME sERIES.
In time-domain astronomy, data gathered from the telescopes is usually represented in the form of light-curves. These are time series that show the brightness variation of an object through a period of time (for a visual representation see video below). Based on the variability characteristics of the light-curves, celestial objects can be classified into different groups (quasars, long period variables, eclipsing binaries, etc.) and consequently be studied in depth independently.
In order to characterize this variability, some of the existing methods use machine learning algorithms that build their decision on the light-curve features. Features, the topic of the following work, are numerical descriptors that aim to characterize and distinguish the different variability classes. They can go from basic statistical measures such as the mean or the standard deviation, to complex time-series characteristics such as the autocorrelation function.
In this package we present a library with a compilation of some of the existing light-curve features. The main goal is to create a collaborative and open tool where every user can characterize or analyze an astronomical photometric database while also contributing to the library by adding new features. However, it is important to highlight that this library is not restricted to the astronomical field and could also be applied to any kind of time series.
Our vision is to be capable of analyzing and comparing light-curves from all the available astronomical catalogs in a standard and universal way. This would facilitate tasks such as modelling, classification, data cleaning, outlier detection and data analysis in general, and make them more efficient. Consequently, when studying light-curves, astronomers and data analysts would be on the same wavelength and would not need to find a way of comparing or matching different features. In order to achieve this goal, the library should be run on every existing survey (MACHO, EROS, OGLE, Catalina, Pan-STARRS, etc.) and on future surveys (LSST), and the results should ideally be shared in the same open way as this library.
- feets.read_json(path_or_buffer)
Deserialize a JSON formatted string or file to feets.FeatureSpace.
- Parameters:
- path_or_buffer : str, pathlib.Path or file-like object
The file path, buffer, or stream to read the JSON data from.
- Returns:
- feets.FeatureSpace
A feets.FeatureSpace object containing the deserialized data.
See also
feets.FeatureSpace: Class to select and extract features from a time series.
store_json
- feets.read_yaml(path_or_buffer)
Deserialize a YAML formatted string or file to feets.FeatureSpace.
- Parameters:
- path_or_buffer : str, pathlib.Path or file-like object
The file path, buffer, or stream to read the YAML data from.
- Returns:
- feets.FeatureSpace
A feets.FeatureSpace object containing the deserialized data.
See also
feets.FeatureSpace: Class to select and extract features from a time series.
store_yaml