Python API

Database

rrgp.database.download_dataset()

Download the raw dataset from its URL and unzip it

rrgp.database.transform_to_text_labels(labels)

Transform numerical labels to corresponding text

Parameters

labels (array) – an array of numerical labels

Returns

labels – same array with corresponding text labels

Return type

array
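
The kind of mapping this function performs can be sketched as follows. This is only an illustration: the names in `LABEL_NAMES` and the helper `to_text_labels` are hypothetical, since this reference does not list the actual text labels of the dataset.

```python
# Hypothetical mapping from numerical class ids to text names; the real
# names come from the dataset and are not listed in this reference.
LABEL_NAMES = {0: "class_a", 1: "class_b", 2: "class_c"}


def to_text_labels(labels):
    """Return a list with each numerical label replaced by its text name."""
    return [LABEL_NAMES[int(label)] for label in labels]


text_labels = to_text_labels([2, 0, 1, 0])
```

The output has the same length and order as the input, so it can be used wherever the numerical labels were used, for example as tick labels in a figure.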

rrgp.database.get_dataset_split(data_path, labels_path)

Get the data and ground truth of the selected split

Parameters
  • data_path (str) – data file location

  • labels_path (str) – labels file location

Returns

  • data (array) – all the data of the split

  • labels (array) – all the corresponding labels (ground-truth)

rrgp.database.load(standardized=False, printSize=False, train_data_path=None, train_labels_path=None, test_data_path=None, test_labels_path=None)

Get the dataset and the corresponding labels, split into a training set and a testing set

Parameters

standardized (bool) – whether to standardize the data before returning it

Returns

  • train_data (array)

  • train_labels (array)

  • test_data (array)

  • test_labels (array)

Pre-processor

rrgp.preprocessor.standardize(train_data, test_data)

Standardize training and testing data

Parameters
  • train_data (array) – Data from which the standardization parameters are calculated. The standardization is also applied to this subset.

  • test_data (array) – Test subset to which the same standardization is applied.

Returns

  • train_data (array) – Standardized training data

  • test_data (array) – Standardized testing data
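
The description above (parameters estimated on the training subset, then applied to both subsets) can be sketched in plain Python. This is a minimal illustration, not the rrgp implementation: the helper name `standardize_sketch`, the use of the population standard deviation, and the guard for constant columns are all assumptions.

```python
from statistics import fmean, pstdev


def standardize_sketch(train_data, test_data):
    """Scale both subsets using the per-column mean and std of train_data."""
    columns = list(zip(*train_data))
    means = [fmean(col) for col in columns]
    # Assumption: population std; a constant column (std 0) is left unscaled.
    stds = [pstdev(col) or 1.0 for col in columns]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return scale(train_data), scale(test_data)


train = [[1.0, 10.0], [3.0, 30.0]]
test = [[2.0, 40.0]]
train_std, test_std = standardize_sketch(train, test)
```

Computing the parameters on the training subset only keeps information from the test subset out of the preprocessing step.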

Machine Learning Algorithm

rrgp.algorithm.train(X, Y, args)

Train a model given the arguments, the dataset and the corresponding labels (ground-truth)

Parameters
  • X (array) – features of the dataset

  • Y (array) – corresponding labels

  • args (dict) – arguments to prepare the model

Returns

model – trained model

Return type

object

rrgp.algorithm.predict(X, model)

Predict labels given the features and the trained model

Parameters
  • X (array) – features to predict on

  • model (object) – trained model

Returns

predictions – Array with the predicted labels

Return type

array
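
The contract between `train` and `predict` (a model object produced from features and labels, then reused on new features) can be illustrated with a toy stand-in. The nearest-centroid classifier below is deliberately not one of the models rrgp trains; `train_sketch` and `predict_sketch` are hypothetical names, and the keys expected in `args` are not documented here, so the sketch ignores them.

```python
def train_sketch(X, Y, args):
    """Stand-in for train(): compute one centroid per class."""
    # `args` only mirrors the documented signature; it is unused here.
    sums, counts = {}, {}
    for row, label in zip(X, Y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_sketch(X, model):
    """Stand-in for predict(): assign each row to the closest centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(model, key=lambda label: sq_dist(row, model[label])) for row in X]


features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = [0, 0, 1, 1]
model = train_sketch(features, labels, args={})
predictions = predict_sketch([[0.2, 0.4], [5.5, 4.8]], model)
```

As in the documented functions, the predictions come back in the same order as the rows passed to the prediction step.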

Analysis

rrgp.evaluator.get_metrics_table(predictions, test_labels)

Generate a metrics table to evaluate predictions

Parameters
  • predictions (array) – Predictions of a model

  • test_labels (array) – Corresponding ground-truth

Returns

table – Nicely formatted plain-text table with the computed metrics

Return type

string
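
This reference does not say which metrics the table contains. As a rough sketch, assuming per-class precision and recall plus overall accuracy, a plain-text table could be built as below; `metrics_table_sketch` and its column layout are hypothetical.

```python
def metrics_table_sketch(predictions, test_labels):
    """Build a plain-text table with per-class precision/recall and accuracy."""
    classes = sorted(set(test_labels) | set(predictions))
    lines = [f"{'class':<10}{'precision':>10}{'recall':>10}"]
    for c in classes:
        tp = sum(p == c and t == c for p, t in zip(predictions, test_labels))
        predicted = sum(p == c for p in predictions)
        actual = sum(t == c for t in test_labels)
        # Report 0.0 when a class was never predicted or never occurs.
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        lines.append(f"{str(c):<10}{precision:>10.2f}{recall:>10.2f}")
    correct = sum(p == t for p, t in zip(predictions, test_labels))
    lines.append(f"{'accuracy':<10}{correct / len(test_labels):>20.2f}")
    return "\n".join(lines)


table = metrics_table_sketch(["a", "b", "b", "a"], ["a", "b", "a", "a"])
print(table)
```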

rrgp.evaluator.get_table_header(model_name, model)

Generate a header for the metrics table

Parameters
  • model_name (str) – Type of model (svm or rf)

  • model (object) – Trained model from which to get the parameters

Returns

header – Nicely formatted text header

Return type

string

rrgp.evaluator.evaluate(predictions, test_data, test_labels, output_dir, model_name, model)

Evaluate the predictions given the ground truth. Save a table with the metrics and a PNG file with the confusion matrix.

Parameters
  • predictions (array) – Predictions of a model

  • test_data (array) – Test data corresponding to the predictions

  • test_labels (array) – Corresponding ground-truth

  • output_dir (str) – Folder name in which to save table and figure

  • model_name (str) – Model type (svm or rf)

  • model (object) – trained model
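
The counts behind a confusion matrix like the one saved by this function can be sketched in plain Python. The helper `confusion_matrix_sketch` is hypothetical, and the orientation (rows for ground truth, columns for predictions) is an assumption; the plotting and file-saving steps are left out.

```python
from collections import Counter


def confusion_matrix_sketch(predictions, test_labels):
    """Return (classes, matrix); rows are ground truth, columns are predictions."""
    classes = sorted(set(test_labels) | set(predictions))
    # Count each (true label, predicted label) pair once.
    counts = Counter(zip(test_labels, predictions))
    matrix = [[counts[(t, p)] for p in classes] for t in classes]
    return classes, matrix


classes, matrix = confusion_matrix_sketch(["a", "b", "b", "a"], ["a", "b", "a", "a"])
```

Diagonal entries count correct predictions; each off-diagonal entry counts samples of the row's class that were predicted as the column's class.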