Features

Features available in the current release, grouped by task.

Data Loading and Handling

  • get_dataset: Load a dataset.

  • get_training_test_data: Split the dataset into training and test sets.

  • load_large_dataset: Load a large dataset efficiently.

  • reduce_data_memory_useage: Reduce the memory usage of the dataset.
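
The signatures of these functions are not shown here, so the following is only a sketch of one common way to cut a dataset's memory footprint with plain pandas: downcasting numeric columns. The helper name downcast_numeric and the sample data are hypothetical and are not part of this library's API.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric columns stored in smaller dtypes.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # Float downcasting trades precision for space.
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


# Made-up sample data.
df = pd.DataFrame({
    "id": np.arange(1000, dtype="int64"),
    "score": np.linspace(0.0, 1.0, 1000),
})

small = downcast_numeric(df)
before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
```

Downcasting floats loses precision, so it is worth checking that downstream calculations tolerate it.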

Data Cleaning and Manipulation

  • drop_columns: Drop specified columns from the dataset.

  • fix_missing_values: Handle missing values in the dataset.

  • fix_unbalanced_dataset: Address class imbalance in a classification dataset.

  • filter_data: Filter data based on specified conditions.

  • remove_duplicates: Remove duplicate rows from the dataset.

  • rename_columns: Rename columns in the dataset.

  • replace_values: Replace specified values in the dataset.

  • reset_index: Reset the index of the dataset.

  • set_index: Set a specific column as the index.

  • sort_index: Sort the index of the dataset.

  • sort_values: Sort the values of the dataset.
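
As a rough illustration of the cleaning steps listed above, the sketch below chains the equivalent plain pandas operations. It assumes nothing about this library's own signatures; the column names and the median fill strategy are illustrative choices.

```python
import pandas as pd

# Made-up sample data with a duplicate row, a missing value, and an unused column.
raw = pd.DataFrame({
    "Name": ["Ann", "Bob", "Bob", "Cid"],
    "Age": [31.0, None, None, 25.0],
    "unused": [0, 0, 0, 0],
})

clean = (
    raw.drop(columns=["unused"])                      # drop columns
    .drop_duplicates()                                # remove duplicate rows
    .rename(columns={"Name": "name", "Age": "age"})   # rename columns
)

# Fill missing ages with the median of the known ages (one of several strategies).
clean["age"] = clean["age"].fillna(clean["age"].median())

# Sort by value and rebuild a clean 0..n-1 index.
clean = clean.sort_values("age").reset_index(drop=True)
print(clean)
```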

Data Formatting and Transformation

  • categorical_to_datetime: Convert categorical columns to datetime format.

  • categorical_to_numerical: Convert categorical columns to numerical format.

  • numerical_to_categorical: Convert numerical columns to categorical format.

  • column_binning: Bin values in a column into specified bins.
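
The conversions above correspond to standard pandas operations. The sketch below shows one plausible version of each, assuming string-typed input columns; the column names, bin edges, and labels are invented for the example and do not reflect this library's parameters.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "joined": ["2023-01-15", "2023-06-30", "2024-02-01"],
    "plan": ["basic", "pro", "basic"],
    "age": [17, 34, 68],
})

# Categorical (string) column to datetime.
df["joined"] = pd.to_datetime(df["joined"])

# Categorical column to integer codes (categories are ordered alphabetically).
df["plan_code"] = df["plan"].astype("category").cat.codes

# Numerical column to labelled bins; intervals are closed on the right by default.
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 18, 65, 120],
    labels=["minor", "adult", "senior"],
)
print(df)
```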

Exploratory Data Analysis

  • eda: Perform exploratory data analysis on the dataset.

  • eda_visual: Visualize exploratory data analysis results.

  • pandas_profiling: Generate a Pandas Profiling report for the dataset.

  • sweetviz_profile_report: Generate a Sweetviz Profile Report for the dataset.

  • count_column_categories: Count the categories in a categorical column.

  • unique_elements_in_columns: List the unique elements in each column of the dataset.
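
For the two counting helpers at the end of this list, the underlying idea can be sketched with plain pandas as follows. The variable names and data are illustrative, and the return types of the library's own functions may differ.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": ["S", "M", "S", "S"],
})

# How often each category occurs in one column.
category_counts = df["color"].value_counts().to_dict()

# The distinct values found in every column.
unique_per_column = {col: sorted(df[col].unique()) for col in df.columns}

print(category_counts)
print(unique_per_column)
```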

Feature Engineering

  • extract_date_features: Extract date-related features from a datetime column.

  • polyreg_x: Generate polynomial features of a specified degree from the independent variables.

  • select_features: Select relevant features for modeling.

  • select_dependent_and_independent: Separate the dependent variable from the independent variables.
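
To make two of these items concrete, here is a minimal sketch of date feature extraction and polynomial feature expansion using pandas and scikit-learn directly. It assumes the library does something comparable internally, which the list above does not confirm; all names are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Date-related features from a datetime column.
dates = pd.DataFrame({"sold_on": pd.to_datetime(["2024-03-01", "2024-12-25"])})
dates["year"] = dates["sold_on"].dt.year
dates["month"] = dates["sold_on"].dt.month
dates["day_of_week"] = dates["sold_on"].dt.dayofweek  # Monday = 0

# Degree-2 polynomial expansion of one independent variable: [1, x, x^2].
X = np.array([[2.0], [3.0]])
X_poly = PolynomialFeatures(degree=2).fit_transform(X)

print(dates)
print(X_poly)
```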

Data Preprocessing

  • scale_independent_variables: Scale independent variables in the dataset.

  • remove_outlier: Remove outliers from the dataset.

  • split_data: Split the dataset into training and test sets.
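
The three preprocessing steps above are often combined in this order: remove outliers, split, then scale. The sketch below uses the 1.5 × IQR rule and scikit-learn's StandardScaler as stand-ins; the rule, the split ratio, and the synthetic data are assumptions, not this library's documented behaviour.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data with one planted outlier.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(50, 5, 200), "y": rng.normal(0, 1, 200)})
df.loc[0, "x"] = 500.0

# Keep only rows whose x lies within 1.5 * IQR of the quartiles.
q1 = df["x"].quantile(0.25)
q3 = df["x"].quantile(0.75)
iqr = q3 - q1
kept = df[df["x"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Split before scaling so the test set does not influence the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    kept[["x"]], kept["y"], test_size=0.2, random_state=0
)

scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training set only avoids leaking test-set statistics into the model.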

Model Building and Evaluation

  • poly_get_optimal_degree: Find the best degree for polynomial regression.

  • get_bestK_KNNregressor: Find the best K value for KNN regression.

  • train_model_regressor: Train a regression model.

  • regressor_predict: Make predictions using a regression model.

  • regressor_evaluation: Evaluate the performance of a regression model.

  • regressor_model_testing: Test a regression model.

  • polyreg_graph: Visualize a polynomial regression graph.

  • simple_linregres_graph: Plot a simple linear regression graph.

  • build_multiple_regressors: Build multiple regression models.

  • build_multiple_regressors_from_features: Build regression models using selected features.

  • build_single_regressor_from_features: Build a single regression model using selected features.

  • get_bestK_KNNclassifier: Find the best K value for KNN classification.

  • train_model_classifier: Train a classification model.

  • classifier_predict: Make predictions using a classification model.

  • classifier_evaluation: Evaluate the performance of a classification model.

  • classifier_model_testing: Test a classification model.

  • classifier_graph: Visualize a classification graph.

  • build_multiple_classifiers: Build multiple classification models.

  • build_multiple_classifiers_from_features: Build classification models using selected features.

  • build_single_classifier_from_features: Build a single classification model using selected features.
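
As an illustration of the classification workflow listed above (choose K, train, predict, evaluate), the following sketch uses scikit-learn directly on the bundled iris dataset. Selecting K by 5-fold cross-validation over odd values is an assumption made for the example; the library's own selection method is not described here. The regression functions presumably follow the same pattern with a regressor and a regression metric.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Score each candidate K on the training data only.
scores = {}
for k in range(1, 16, 2):
    candidate = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(candidate, X_train, y_train, cv=5).mean()
best_k = max(scores, key=scores.get)

# Train with the chosen K, then predict and evaluate on held-out data.
model = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"best K = {best_k}, test accuracy = {accuracy:.3f}")
```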

Data Aggregation and Summarization

  • group_data: Group and summarize data based on specified conditions.
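
Grouping and summarizing maps onto the pandas groupby pattern. The sketch below assumes a single grouping column and two aggregates; the column names are illustrative, and the arguments accepted by group_data itself are not shown in this list.

```python
import pandas as pd

# Made-up sample data.
sales = pd.DataFrame({
    "region": ["N", "S", "N", "S"],
    "units": [10, 5, 20, 15],
})

# One row per region, one column per aggregate.
summary = sales.groupby("region")["units"].agg(["sum", "mean"])
print(summary)
```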

Data Type Handling

  • select_datatype: Select columns of a specific datatype in the dataset.
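
Selecting columns by datatype can be sketched with the pandas select_dtypes method, which may or may not be what select_datatype wraps. The data below is made up for the example.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "name": ["a", "b"],
    "age": [1, 2],
    "height": [1.5, 1.7],
})

# Keep only the numeric columns (integers and floats).
numeric = df.select_dtypes(include="number")
print(list(numeric.columns))
```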