plotting Package

compute_CI Module

WORC.plotting.compute_CI.compute_confidence(metric, N_train, N_test, alpha=0.95)[source]

Function to calculate the adjusted confidence interval for cross-validation.

metric: numpy array

Containing the result for a metric for the different cross-validations (e.g. if 20 cross-validations are performed, it is a list of length 20 with the calculated accuracy for each cross-validation).

N_train: integer

Number of training samples.

N_test: integer

Number of test samples.

alpha: float, default 0.95

Ranging from 0 to 1 to calculate the alpha*100% CI.
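A minimal usage sketch (the metric values and sample sizes below are made up for illustration, and the return value is assumed to be the lower and upper bounds of the confidence interval):

    import numpy as np
    from WORC.plotting.compute_CI import compute_confidence

    # One accuracy value per cross-validation iteration (illustrative values)
    accuracies = np.array([0.78, 0.81, 0.75, 0.80, 0.79])

    # 95% confidence interval, assuming 80 training and 20 test samples
    ci = compute_confidence(accuracies, N_train=80, N_test=20, alpha=0.95)
    print(ci)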

WORC.plotting.compute_CI.compute_confidence_bootstrap(bootstrap_metric, test_metric, N_1, alpha=0.95)[source]

Function to calculate the confidence interval for bootstrapped samples.

bootstrap_metric: numpy array

Containing the result for a metric for the different bootstrap iterations.

test_metric

The value of the metric evaluated on the true, full test set.

alpha: float, default 0.95

Ranging from 0 to 1 to calculate the alpha*100% CI.

WORC.plotting.compute_CI.compute_confidence_logit(metric, N_train, N_test, alpha=0.95)[source]

Function to calculate the adjusted confidence interval.

metric: numpy array

Containing the result for a metric for the different cross-validations (e.g. if 20 cross-validations are performed, it is a list of length 20 with the calculated accuracy for each cross-validation).

N_train: integer

Number of training samples.

N_test: integer

Number of test samples.

alpha: float, default 0.95

Ranging from 0 to 1 to calculate the alpha*100% CI.

linstretch Module

WORC.plotting.linstretch.linstretch(i, i_max=255, i_min=0)[source]

Stretch the pixel values of the input image i to the range [i_min, i_max].
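As an illustration of the mapping, a conventional linear stretch can be written as follows (a sketch of the standard formula, not the WORC source itself):

    import numpy as np

    def linstretch_sketch(i, i_max=255, i_min=0):
        # Map the minimum of i to i_min and the maximum of i to i_max,
        # scaling all values in between linearly.
        i = np.asarray(i, dtype=float)
        return (i - i.min()) / (i.max() - i.min()) * (i_max - i_min) + i_min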

plot_ROC Module

WORC.plotting.plot_ROC.ROC_thresholding(fprt, tprt, thresholds, nsamples=20)[source]

Construct FPR and TPR ratios at different thresholds for the scores of an estimator.

WORC.plotting.plot_ROC.main()[source]
WORC.plotting.plot_ROC.plot_ROC(prediction, pinfo, ensemble=1, label_type=None, output_png=None, output_tex=None, output_csv=None)[source]
WORC.plotting.plot_ROC.plot_ROC_CIc(y_truth, y_score, N_1, N_2, plot='default', alpha=0.95, verbose=False, DEBUG=False, tsamples=20)[source]

Plot a Receiver Operator Characteristic (ROC) curve with confidence intervals.

tsamples: number of sample points on which to determine the confidence intervals.

The sample points are used on the thresholds for y_score.
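A minimal usage sketch (the structure of y_truth and y_score as one list per cross-validation iteration, as well as the sample sizes N_1 and N_2, are assumptions for illustration):

    from WORC.plotting.plot_ROC import plot_ROC_CIc

    # Ground truth and scores, one list per cross-validation iteration (illustrative)
    y_truth = [[0, 1, 1, 0, 1], [1, 0, 1, 0, 1]]
    y_score = [[0.2, 0.8, 0.6, 0.3, 0.9], [0.7, 0.1, 0.8, 0.4, 0.6]]

    # The returned object(s) are not documented above; captured here for completeness
    roc = plot_ROC_CIc(y_truth, y_score, N_1=80, N_2=20, alpha=0.95, tsamples=20)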

WORC.plotting.plot_ROC.plot_single_ROC(y_truth, y_score, verbose=False, returnplot=False)[source]

Get the False Positive Ratio (FPR) and True Positive Ratio (TPR) for the ground truth and score of a single estimator. These ratios can be used to plot a Receiver Operator Characteristic (ROC) curve.

plot_barchart Module

WORC.plotting.plot_barchart.count_parameters(parameters)[source]
WORC.plotting.plot_barchart.main()[source]
WORC.plotting.plot_barchart.paracheck(parameters)[source]
WORC.plotting.plot_barchart.plot_barchart(prediction, estimators=10, label_type=None, output_tex=None, output_png=None)[source]

Make a barchart of the top X hyperparameter settings of the ranked estimators in all cross-validation iterations. A usage sketch follows the parameter list below.

prediction: filepath, mandatory

Path pointing to the .hdf5 file which is the output of the trainclassifier function.

estimators: integer, default 10

Number of hyperparameter settings/estimators used in each cross-validation. The settings are ranked, so when supplying e.g. 10, the best 10 settings in each cross-validation iteration will be used.

label_type: string, default None

The name of the label predicted by the estimator. If None, the first label from the prediction file will be used.

output_tex: filepath, optional

If given, the barchart will be written to this tex file.

output_png: filepath, optional

If given, the barchart will be written to this png file.

fig: matplotlib figure

The figure in which the barchart is plotted.
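A minimal usage sketch (the file paths are illustrative):

    from WORC.plotting.plot_barchart import plot_barchart

    # Plot the 10 best hyperparameter settings per cross-validation and save as png
    fig = plot_barchart(prediction='/path/to/estimator.hdf5',
                        estimators=10,
                        output_png='/path/to/barchart.png')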

WORC.plotting.plot_barchart.plot_bars(params, normalization_factor=None, figwidth=40, fontsize=30, spacing=2)[source]

plot_boxplot_features Module

WORC.plotting.plot_boxplot_features.generate_feature_boxplots(image_features, label_data, output_zip, dpi=500, verbose=False)[source]

Generate boxplots of the feature values among different objects.

image_features: list, mandatory

List with a dictionary of the feature labels and values for each patient.

label_data: pandas dataframe, mandatory

Dataframe containing the labels of the objects.

output_zip: path, mandatory

Zip file to which the output boxplots should be written.

WORC.plotting.plot_boxplot_features.plot_boxplot_features(features, label_data, config, output_zip, label_type=None, verbose=False)[source]

plot_boxplot_performance Module

WORC.plotting.plot_boxplot_performance.generate_performance_boxplots(performances, metrics, outputfolder, colors=None)[source]

Generate boxplots of the performance of various models.

WORC.plotting.plot_boxplot_performance.test()[source]

plot_estimator_performance Module

WORC.plotting.plot_estimator_performance.combine_multiple_estimators(predictions, label_data, multilabel_type, label_types, ensemble=1, strategy='argmax', alpha=0.95)[source]

Combine multiple estimators in a single model.

Note: the multilabel_type labels should correspond to the ordering in label_types. Hence, if multilabel_type = 0, the prediction is label_types[0] etc.

WORC.plotting.plot_estimator_performance.compute_statistics(y_truth, y_score, y_prediction, modus, regression)[source]

Compute statistics on predictions.

WORC.plotting.plot_estimator_performance.fit_thresholds(thresholds, estimator, X_train, Y_train, ensemble, ensemble_scoring)[source]
WORC.plotting.plot_estimator_performance.main()[source]
WORC.plotting.plot_estimator_performance.plot_estimator_performance(prediction, label_data, label_type, crossval_type=None, alpha=0.95, ensemble=None, verbose=True, ensemble_scoring=None, output=None, modus=None, thresholds=None, survival=False, shuffle_estimators=False, bootstrap=None, bootstrap_N=None, overfit_scaler=None)[source]

Plot the output of a single estimator, e.g. an SVM. A usage sketch follows the parameter list below.

prediction: pandas dataframe or string, mandatory

Output of the trainclassifier function, either a pandas dataframe or an HDF5 file.

label_data: string, mandatory

Contains the path referring to a .txt file containing the patient label(s) and value(s) to be used for learning. See the Github Wiki for the format.

label_type: string, mandatory

Name of the label to extract from the label data to test the estimator on.

alpha: float, default 0.95

Significance of confidence intervals.

ensemble: False, integer or ‘Caruana’

Determine whether an ensemble will be created. If so, either provide an integer to determine how many of the top performing classifiers should be in the ensemble, or use the string “Caruana” to use smart ensembling based on Caruana et al. 2004.

verbose: boolean, default True

Print intermediate messages.

ensemble_scoring: string, default None

Metric to be used for evaluating the ensemble. If None, the option set in the prediction object will be used.

output: string, default stats

Determine which results are output. If 'stats', the statistics of the estimator will be returned. If 'scores', the scores will be returned.

thresholds: list of integer(s), default None

If None, use the default threshold of sklearn (0.5) on posteriors to convert to a binary prediction. If one integer is provided, use that one. If two integers are provided, posterior < thresh[0] = 0, posterior > thresh[1] = 1.

Depending on the output parameter, the following outputs are returned:

If output == ‘stats’: stats: dictionary

Contains the confidence intervals of the performance metrics and the number of times each patient was classified correctly or incorrectly.

If output == ‘scores’: y_truths: list

Contains the true label for each object.

y_scores: list

Contains the score (e.g. posterior) for each object.

y_predictions: list

Contains the predicted label for each object.

pids: list

Contains the patient ID/name for each object.
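A minimal usage sketch (the file paths and label name are illustrative):

    from WORC.plotting.plot_estimator_performance import plot_estimator_performance

    # Return the confidence intervals of the performance metrics as a dictionary
    stats = plot_estimator_performance(prediction='/path/to/estimator.hdf5',
                                       label_data='/path/to/pinfo.txt',
                                       label_type='Malignant',
                                       alpha=0.95,
                                       output='stats')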

plot_hyperparameters Module

WORC.plotting.plot_hyperparameters.plot_hyperparameters(prediction, label_type=None, estsize=50, output=None, removeconstants=False, verbose=False)[source]

Gather which hyperparameters have been used in the best workflows.

prediction: pandas dataframe or string, mandatory

Output of the trainclassifier function, either a pandas dataframe or an HDF5 file.

estsize: integer, default 50

Number of estimators that should be taken into account.

output: filename of csv, default None

Output file to write to. If None, no output file is written; the result is only returned as a variable.

removeconstants: boolean, default False

Determine whether to remove any hyperparameters which have the same value in all workflows.

verbose: boolean, default False

Whether to show print messages or not.
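A minimal usage sketch (the file paths are illustrative):

    from WORC.plotting.plot_hyperparameters import plot_hyperparameters

    # Gather the hyperparameters of the 50 best workflows, dropping constant ones,
    # and additionally write them to a csv file
    hyperparameters = plot_hyperparameters(prediction='/path/to/estimator.hdf5',
                                           estsize=50,
                                           removeconstants=True,
                                           output='/path/to/hyperparameters.csv')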

plot_images Module

WORC.plotting.plot_images.bbox_2D(img, mask, padding=[1, 1], img2=None)[source]
WORC.plotting.plot_images.extract_boundary(contour, radius=2)[source]
WORC.plotting.plot_images.plot_im_and_overlay(image, mask=None, figsize=(3, 3), alpha=0.4)[source]

Plot an image in a matplotlib figure and overlay with a mask.

WORC.plotting.plot_images.slicer(image, mask=None, output_name=None, output_name_zoom=None, thresholds=[-240, 160], zoomfactor=4, dpi=500, normalize=False, expand=False, boundary=False, square=False, flip=True, alpha=0.4, index=None)[source]

image and mask should both be arrays
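A minimal usage sketch, assuming (per the note above) that plain numpy arrays are accepted; the arrays, output path and threshold values are illustrative:

    import numpy as np
    from WORC.plotting.plot_images import slicer

    # Illustrative CT-like volume and a small segmentation
    image = np.random.randint(-1000, 1000, size=(64, 64, 16))
    mask = np.zeros_like(image)
    mask[20:40, 20:40, 8] = 1

    slicer(image, mask=mask, output_name='/path/to/slice.png',
           thresholds=[-240, 160], zoomfactor=4)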

plot_pvalues_features Module

WORC.plotting.plot_pvalues_features.manhattan_importance(values, labels, feature_labels, output_png=None, output_tex=None, mapping=None, threshold_annotated=0.05)[source]

plot_ranked_scores Module

WORC.plotting.plot_ranked_scores.example()[source]
WORC.plotting.plot_ranked_scores.main()[source]
WORC.plotting.plot_ranked_scores.plot_ranked_images(pinfo, label_type, images, segmentations, ranked_truths, ranked_scores, ranked_PIDs, output_zip=None, output_itk=None, zoomfactor=4)[source]
WORC.plotting.plot_ranked_scores.plot_ranked_percentages(estimator, pinfo, label_type=None, ensemble=50, output_csv=None)[source]
WORC.plotting.plot_ranked_scores.plot_ranked_posteriors(estimator, pinfo, label_type=None, ensemble=50, output_csv=None)[source]
WORC.plotting.plot_ranked_scores.plot_ranked_scores(estimator, pinfo, label_type, scores='percentages', images=[], segmentations=[], ensemble=50, output_csv=None, output_zip=None, output_itk=None)[source]

Rank the patients according to their average score. The score can either be the average posterior or the percentage of times the patient was classified correctly in the cross-validations. Additionally, the middle slice of each patient is plotted and saved according to the ranking. A usage sketch follows the parameter list below.

estimator: filepath, mandatory

Path pointing to the .hdf5 file which is the output of the trainclassifier function.

pinfo: filepath, mandatory

Path pointing to the .txt file which contains the patient label information.

label_type: string, default None

The name of the label predicted by the estimator. If None, the first label from the prediction file will be used.

scores: string, default percentages

Type of scoring to be used. Either ‘posteriors’ or ‘percentages’.

images: list, optional

List containing the filepaths to the ITKImage image files of the patients.

segmentations: list, optional

List containing the filepaths to the ITKImage segmentation files of the patients.

ensemble: integer or string, optional

Method to be used for ensembling. Either an integer for a fixed size or ‘Caruana’ for the Caruana method, see the SearchCV function for more details.

output_csv: filepath, optional

If given, the scores will be written to this csv file.

output_zip: filepath, optional

If given, the images will be plotted and the pngs saved to this zip file.

output_itk: filepath, optional

WIP
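A minimal usage sketch (the file paths and label name are illustrative):

    from WORC.plotting.plot_ranked_scores import plot_ranked_scores

    # Rank patients by the percentage of cross-validations in which they were
    # classified correctly and write the ranking to a csv file
    plot_ranked_scores(estimator='/path/to/estimator.hdf5',
                       pinfo='/path/to/pinfo.txt',
                       label_type='Malignant',
                       scores='percentages',
                       ensemble=50,
                       output_csv='/path/to/ranked_scores.csv')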

plotminmaxresponse Module

WORC.plotting.plotminmaxresponse.main()[source]

scatterplot Module

WORC.plotting.scatterplot.main()[source]
WORC.plotting.scatterplot.make_scatterplot(features, label_file, feature_label_1, feature_label_2, output)[source]
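A minimal usage sketch (the file paths and feature labels are illustrative, and the expected argument types are assumptions based on the parameter names):

    from WORC.plotting.scatterplot import make_scatterplot

    # Scatter two features against each other for all patients
    make_scatterplot(features='/path/to/features.hdf5',
                     label_file='/path/to/pinfo.txt',
                     feature_label_1='original_shape_volume',
                     feature_label_2='original_glcm_contrast',
                     output='/path/to/scatterplot.png')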