Features
Features from the current release.
Data Loading and Handling
get_dataset
: Load a dataset.
get_training_test_data
: Split the dataset into training and test sets.
load_large_dataset
: Load a large dataset efficiently.
reduce_data_memory_useage
: Reduce memory usage of the dataset.
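The signatures of these functions are not documented in this list, so the sketch below uses pandas directly to illustrate one common way to reduce memory usage: downcasting numeric columns to the smallest dtype that still holds their values. The helper name `downcast_numeric` is hypothetical and is not part of this library.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric columns stored in smaller dtypes.

    Hypothetical helper for illustration; not this library's API.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # Picks the smallest signed integer dtype that fits the values.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # float32 is the smallest float dtype pandas will downcast to.
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


df = pd.DataFrame({
    "a": np.arange(1000, dtype="int64"),
    "b": np.linspace(0.0, 1.0, 1000),
})
small = downcast_numeric(df)

print(df.memory_usage(deep=True).sum(), "bytes before")
print(small.memory_usage(deep=True).sum(), "bytes after")
```

Downcasting floats trades precision for space, so it is worth checking that the reduced precision is acceptable before applying it to model inputs.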
Data Cleaning and Manipulation
drop_columns
: Drop specified columns from the dataset.
fix_missing_values
: Handle missing values in the dataset.
fix_unbalanced_dataset
: Address class imbalance in a classification dataset.
filter_data
: Filter data based on specified conditions.
remove_duplicates
: Remove duplicate rows from the dataset.
rename_columns
: Rename columns in the dataset.
replace_values
: Replace specified values in the dataset.
reset_index
: Reset the index of the dataset.
set_index
: Set a specific column as the index.
sort_index
: Sort the index of the dataset.
sort_values
: Sort the values of the dataset.
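Most of the cleaning steps above correspond to familiar pandas operations. The following is a minimal sketch of such a pipeline written with pandas itself; it assumes nothing about this library's own function signatures, and the column names and values are made up for the example.

```python
import numpy as np
import pandas as pd

# Illustrative data: one duplicated row, one missing age, one placeholder value.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 4],
    "Age": [25.0, 52.0, 52.0, np.nan, 31.0],
    "city": ["Accra", "Lagos", "Lagos", "?", "Accra"],
    "notes": ["a", "b", "b", "c", "d"],
})

clean = (
    df.drop(columns=["notes"])                  # drop columns
      .drop_duplicates()                        # remove duplicate rows
      .rename(columns={"Age": "age"})           # rename columns
      .replace({"city": {"?": "Unknown"}})      # replace specified values
)

# Handle missing values: fill with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())

# Filter, sort, and set a column as the index.
clean = clean[clean["age"] >= 30].sort_values(["age", "id"]).set_index("id")
print(clean)
```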
Data Formatting and Transformation
categorical_to_datetime
: Convert categorical columns to datetime format.
categorical_to_numerical
: Convert categorical columns to numerical format.
numerical_to_categorical
: Convert numerical columns to categorical format.
column_binning
: Bin values in a column into specified bins.
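To show what these transformations look like in practice, here is a short sketch using plain pandas. It is an illustration of the underlying techniques under assumed column names and bin edges, not a description of how this library implements them.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [5, 17, 30, 64, 80],
    "size": ["S", "M", "L", "M", "S"],
    "joined": ["2021-01-15", "2022-06-30", "2020-12-01", "2023-03-09", "2021-07-04"],
})

# Categorical (string) column to datetime.
df["joined"] = pd.to_datetime(df["joined"], format="%Y-%m-%d")

# Categorical column to numerical codes (categories are sorted: L=0, M=1, S=2).
df["size_code"] = df["size"].astype("category").cat.codes

# Numerical column binned into labelled categories.
# Intervals are right-closed by default: (0, 18], (18, 65], (65, 120].
df["age_group"] = pd.cut(
    df["age"], bins=[0, 18, 65, 120], labels=["child", "adult", "senior"]
)

print(df)
```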
Exploratory Data Analysis
eda
: Perform exploratory data analysis on the dataset.
eda_visual
: Visualize exploratory data analysis results.
pandas_profiling
: Generate a Pandas Profiling report for the dataset.
sweetviz_profile_report
: Generate a Sweetviz Profile Report for the dataset.
count_column_categories
: Count the categories in a categorical column.
unique_elements_in_columns
: Get the unique elements that exist in each column in the dataset.
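The last two entries can be approximated with a few lines of pandas. The sketch below counts categories and collects unique values per column; the variable names are illustrative and the exact return types of this library's functions may differ.

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "red"],
    "grade": [1, 2, 2, 1, 1],
})

# Count how often each category appears in a categorical column.
category_counts = df["color"].value_counts().to_dict()

# Collect the unique elements of every column.
unique_per_column = {col: sorted(df[col].unique().tolist()) for col in df.columns}

# A basic numeric and categorical summary, as a starting point for EDA.
summary = df.describe(include="all")

print(category_counts)
print(unique_per_column)
```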
Feature Engineering
extract_date_features
: Extract date-related features from a datetime column.
polyreg_x
: Generate polynomial features of a specified degree from the independent variables.
select_features
: Select relevant features for modeling.
select_dependent_and_independent
: Select dependent and independent variables.
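As a rough illustration of date feature extraction and of separating independent from dependent variables, the sketch below uses the pandas `.dt` accessor. The data and the choice of `revenue` as the target are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-31", "2023-02-01", "2024-12-25"]),
    "units": [10, 12, 7],
    "revenue": [100.0, 130.0, 90.0],
})

# Extract date-related features from the datetime column.
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day_of_week"] = df["date"].dt.dayofweek  # Monday=0, Sunday=6

# Independent variables (X) and dependent variable (y).
X = df[["year", "month", "day_of_week", "units"]]
y = df["revenue"]

print(X)
```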
Data Preprocessing
scale_independent_variables
: Scale independent variables in the dataset.
remove_outlier
: Remove outliers from the dataset.
split_data
: Split the dataset into training and test sets.
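The three preprocessing steps above can be sketched with pandas and NumPy alone. This example assumes the 1.5 × IQR rule for outliers and standardization (zero mean, unit variance) for scaling; the library may use different methods, and `remove_iqr_outliers` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def remove_iqr_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]. Hypothetical helper."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    return df[df[column].between(q1 - k * iqr, q3 + k * iqr)]


rng = np.random.default_rng(0)
df = pd.DataFrame({"x": np.append(rng.normal(50, 5, 99), 500.0)})  # 500.0 is an obvious outlier
df["y"] = 2.0 * df["x"] + 1.0

filtered = remove_iqr_outliers(df, "x")

# Shuffle, then split 80/20 into training and test sets.
shuffled = filtered.sample(frac=1.0, random_state=0)
n_train = int(0.8 * len(shuffled))
train, test = shuffled.iloc[:n_train], shuffled.iloc[n_train:]

# Scale using statistics from the training set only, to avoid leaking test data.
mean, std = train["x"].mean(), train["x"].std(ddof=0)
train_scaled = (train["x"] - mean) / std
test_scaled = (test["x"] - mean) / std
```

Fitting the scaling statistics on the training set and reusing them for the test set keeps the evaluation honest, which is why the split happens before scaling here.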
Model Building and Evaluation
poly_get_optimal_degree
: Find the best degree for polynomial regression.
get_bestK_KNNregressor
: Find the best K value for KNN regression.
train_model_regressor
: Train a regression model.
regressor_predict
: Make predictions using a regression model.
regressor_evaluation
: Evaluate the performance of a regression model.
regressor_model_testing
: Test a regression model.
polyreg_graph
: Visualize a polynomial regression graph.
simple_linregres_graph
: Visualize a regression graph.
build_multiple_regressors
: Build multiple regression models.
build_multiple_regressors_from_features
: Build regression models using selected features.
build_single_regressor_from_features
: Build a single regression model using selected features.
get_bestK_KNNclassifier
: Find the best K value for KNN classification.
train_model_classifier
: Train a classification model.
classifier_predict
: Make predictions using a classification model.
classifier_evaluation
: Evaluate the performance of a classification model.
classifier_model_testing
: Test a classification model.
classifier_graph
: Visualize a classification graph.
build_multiple_classifiers
: Build multiple classification models.
build_multiple_classifiers_from_features
: Build classification models using selected features.
build_single_classifier_from_features
: Build a single classification model using selected features.
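One plausible way to find the best K for a KNN classifier is to compare cross-validated accuracy across candidate values. The sketch below does this with scikit-learn on synthetic data; the function name `best_k_by_cv` is hypothetical, and this list does not state which selection criterion the library's own functions use.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def best_k_by_cv(X, y, k_values, cv=5):
    """Return (best_k, scores), where scores maps each k to its mean CV accuracy.

    Hypothetical helper for illustration; not this library's API.
    """
    scores = {}
    for k in k_values:
        model = KNeighborsClassifier(n_neighbors=k)
        scores[k] = cross_val_score(model, X, y, cv=cv).mean()
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic binary classification data, used only for demonstration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
best_k, scores = best_k_by_cv(X, y, range(1, 16, 2))
print("best k:", best_k)
```

The same loop works for regression by swapping in `KNeighborsRegressor`, in which case the default score becomes R² instead of accuracy.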
Data Aggregation and Summarization
group_data
: Group and summarize data based on specified conditions.
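Grouping and summarizing is typically a thin layer over the pandas `groupby` method. A minimal sketch with made-up data, not tied to this library's actual parameters:

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "south", "north", "south", "north"],
    "sales": [10, 20, 30, 40, 50],
})

# Group by a column and compute several summary statistics at once.
summary = df.groupby("region")["sales"].agg(["sum", "mean", "count"])
print(summary)
```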
Data Type Handling
select_datatype
: Select columns of a specific datatype in the dataset.
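Selecting columns by datatype corresponds to `DataFrame.select_dtypes` in pandas. The sketch below shows that call on illustrative data; how this library wraps it is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b"],
    "age": [30, 41],
    "height": [1.7, 1.8],
    "member": [True, False],
})

# Numeric columns only (booleans are not counted as numbers).
numeric = df.select_dtypes(include="number")

# Boolean columns only.
flags = df.select_dtypes(include="bool")

# Everything that is not numeric.
non_numeric = df.select_dtypes(exclude="number")

print(list(numeric.columns), list(flags.columns), list(non_numeric.columns))
```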