Python API

Database

rrgp.database.download_dataset()

Download the raw dataset from its URL and unzip it
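The download-and-unzip step can be illustrated with the standard library alone. This is a minimal sketch, not the package's implementation: the helper name `download_and_unzip`, its `url` and `dest_dir` parameters, and the archive file name `dataset.zip` are all assumptions made for illustration (the documented function takes no arguments).

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_unzip(url, dest_dir):
    """Fetch a zip archive from `url` and extract it into `dest_dir`.

    Hypothetical helper: name, parameters and archive file name are
    assumptions, not part of the rrgp API.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / "dataset.zip"

    # Copy the remote (or file://) resource to a local file.
    urllib.request.urlretrieve(url, str(archive_path))

    # Extract every member of the archive next to the downloaded file.
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

The function returns the sorted list of extracted member names, which makes it easy to check that the expected files are present before loading them.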

rrgp.database.transform_to_text_labels(labels)

Transform numerical labels to corresponding text

Parameters

labels (array) – an array of numerical labels

Returns

labels – same array with corresponding text labels

Return type

array
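The idea of mapping numerical labels to text can be sketched as a dictionary lookup. The mapping `LABEL_NAMES` and the function name `to_text_labels` below are hypothetical; the real label names depend on the dataset and are not stated in this reference.

```python
# Hypothetical mapping, for illustration only; the actual label names
# used by rrgp are not documented here.
LABEL_NAMES = {1: "class_a", 2: "class_b", 3: "class_c"}


def to_text_labels(labels, names=LABEL_NAMES):
    """Return a list with each numerical label replaced by its text name."""
    # int() lets the lookup accept labels stored as floats (e.g. 1.0).
    return [names[int(label)] for label in labels]
```

An unknown label raises `KeyError`, which is usually preferable to silently passing a number through.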

rrgp.database.get_dataset_split(data_path, labels_path)

Get the data and ground truth of the selected split

Parameters

data_path (str) – data file location
labels_path (str) – labels file location

Returns

data (array) – all the data of the split
labels (array) – all the corresponding labels (ground-truth)
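A minimal sketch of reading one split from a data file and a labels file follows. The file format is an assumption (one sample per line, whitespace-separated numeric features, one integer label per line); the reference does not specify it, and `read_split_sketch` is a hypothetical name.

```python
def read_split_sketch(data_path, labels_path):
    """Read features and labels of one split from two text files.

    Assumed format (not confirmed by the reference): one sample per line,
    whitespace-separated floats in the data file, one integer per line in
    the labels file.
    """
    with open(data_path) as f:
        data = [[float(v) for v in line.split()] for line in f if line.strip()]
    with open(labels_path) as f:
        labels = [int(line) for line in f if line.strip()]

    # Each sample needs exactly one ground-truth label.
    if len(data) != len(labels):
        raise ValueError("data and labels have different lengths")
    return data, labels
```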

rrgp.database.load(standardized=False, printSize=False, train_data_path=None, train_labels_path=None, test_data_path=None, test_labels_path=None)

Get the dataset and the corresponding labels split into a training and a testing set

Parameters

standardized (bool) – whether to standardize the data before returning it (default: False)

Returns

train_data (array)
train_labels (array)
test_data (array)
test_labels (array)

Pre-processor

rrgp.preprocessor.standardize(train_data, test_data)

Standardize training and testing data

Parameters

train_data (array) – Training subset from which the standardization parameters are computed. The standardization is also applied to this subset.
test_data (array) – Test subset to which the standardization is applied.

Returns

train_data (array) – Standardized training data
test_data (array) – Standardized testing data
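The key point of the description above is that the parameters come from the training data only and are then reused on the test data. A minimal sketch, assuming standardization means per-feature z-scoring with the population standard deviation (the reference does not say which estimator is used); `standardize_sketch` is a hypothetical name:

```python
import statistics


def standardize_sketch(train_data, test_data):
    """Z-score both subsets using the mean and std of the training data."""
    # Transpose rows into per-feature columns of the training set.
    columns = list(zip(*train_data))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: population std; constant features fall back to 1.0
    # so that the division below never divides by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    # The test data is scaled with the training parameters, never its own.
    return scale(train_data), scale(test_data)
```

Computing the parameters on the training subset alone keeps information from the test subset out of the model pipeline.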

Machine Learning Algorithm

rrgp.algorithm.train(X, Y, args)

Train a model given the arguments, the dataset and the corresponding labels (ground-truth)

Parameters

X (array) – features of the dataset
Y (array) – corresponding labels
args (dict) – arguments to prepare the model

Returns

model – trained model

Return type

object

rrgp.algorithm.predict(X, model)

Predict labels given the features and the trained model

Parameters

X (array) – features to predict on
model (object) – trained model

Returns

predictions – Array with the predicted labels

Return type

array
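The call shape of the two functions above (train returns a model object, predict consumes it) can be sketched with a deliberately simple stand-in. The example below uses a nearest-centroid classifier, which is not one of the package's models, and the `"metric"` key in `args` is a hypothetical option; the accepted keys of the real `args` dict are not documented here.

```python
import math


def train_sketch(X, Y, args):
    """Fit a nearest-centroid classifier (a stand-in model for illustration)."""
    metric = args.get("metric", "euclidean")  # hypothetical option
    if metric not in ("euclidean", "manhattan"):
        raise ValueError(f"unknown metric: {metric}")

    # Accumulate per-class feature sums and sample counts.
    sums, counts = {}, {}
    for row, label in zip(X, Y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }
    return {"metric": metric, "centroids": centroids}


def predict_sketch(X, model):
    """Assign each sample the label of its closest class centroid."""
    centroids = model["centroids"]

    def distance(a, b):
        if model["metric"] == "manhattan":
            return sum(abs(x - y) for x, y in zip(a, b))
        return math.dist(a, b)

    return [
        min(centroids, key=lambda label: distance(row, centroids[label]))
        for row in X
    ]
```

Keeping every configuration choice inside the returned model means prediction needs nothing but the features and that object, which mirrors the documented signatures.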

Analysis

rrgp.evaluator.get_metrics_table(predictions, test_labels)

Generate a metrics table to evaluate predictions

Parameters

predictions (array) – Predictions of a model
test_labels (array) – Corresponding ground-truth

Returns

table – Nicely formatted plain-text table with the computed metrics

Return type

str
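A plain-text metrics table can be built with string formatting alone. The sketch below is an assumption about what such a table might contain (per-class precision and recall plus overall accuracy); the reference does not list the metrics actually computed, and `metrics_table_sketch` is a hypothetical name.

```python
def metrics_table_sketch(predictions, test_labels):
    """Return a plain-text table with per-class precision/recall and accuracy."""
    pairs = list(zip(predictions, test_labels))
    classes = sorted(set(test_labels) | set(predictions))

    lines = [f"{'class':<10}{'precision':>10}{'recall':>10}"]
    for c in classes:
        true_positives = sum(p == c and t == c for p, t in pairs)
        predicted = sum(p == c for p in predictions)
        actual = sum(t == c for t in test_labels)
        # Report 0.0 when a class was never predicted or never occurs.
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        lines.append(f"{str(c):<10}{precision:>10.2f}{recall:>10.2f}")

    accuracy = sum(p == t for p, t in pairs) / len(pairs)
    lines.append(f"{'accuracy':<10}{accuracy:>20.2f}")
    return "\n".join(lines)
```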

rrgp.evaluator.get_table_header(model_name, model)

Generate a header for the metrics table

Parameters

model_name (str) – Type of model (svm or rf)
model (object) – Trained model from which to get the parameters

Returns

header – Nicely formatted text header

Return type

str

rrgp.evaluator.evaluate(predictions, test_data, test_labels, output_dir, model_name, model)

Evaluate the predictions against the ground truth. Save a table with the metrics and a PNG file with the confusion matrix.

Parameters

predictions (array) – Predictions of a model
test_data (array) – Test data corresponding to the predictions
test_labels (array) – Corresponding ground-truth
output_dir (str) – Folder name in which to save table and figure
model_name (str) – Model type (svm or rf)
model (object) – trained model
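The confusion matrix mentioned above is a table of counts indexed by true and predicted label. The sketch below covers only the counting step, under the assumption that rows are true labels and columns are predicted labels; rendering the matrix to a PNG and writing files to `output_dir` are left out, and `confusion_matrix_sketch` is a hypothetical name.

```python
from collections import Counter


def confusion_matrix_sketch(predictions, test_labels):
    """Count (true label, predicted label) pairs.

    Returns the sorted class list and a matrix where matrix[i][j] is the
    number of samples with true class i that were predicted as class j.
    """
    classes = sorted(set(test_labels) | set(predictions))
    # Counter returns 0 for pairs that never occur.
    counts = Counter(zip(test_labels, predictions))
    matrix = [[counts[(t, p)] for p in classes] for t in classes]
    return classes, matrix
```

The diagonal holds the correctly classified samples, so its sum divided by the total number of samples gives the accuracy.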