plotting Package¶
compute_CI Module¶
- WORC.plotting.compute_CI.compute_confidence(metric, N_train, N_test, alpha=0.95)[source]¶
Function to calculate the adjusted confidence interval for cross-validation.
- metric: numpy array containing the result of a metric for the different cross-validations (e.g. if 20 cross-validations are performed, a list of length 20 with the calculated accuracy for each cross-validation)
- N_train: integer, number of training samples
- N_test: integer, number of test samples
- alpha: float ranging from 0 to 1 to calculate the alpha*100% CI, default 0.95
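A minimal usage sketch, assuming WORC is installed; the accuracy values are illustrative, and the return format (lower and upper bound of the interval) is an assumption here:

    import numpy as np
    from WORC.plotting.compute_CI import compute_confidence

    # Illustrative accuracies from 20 cross-validation iterations
    accuracies = np.array([0.80, 0.75, 0.85, 0.78, 0.82, 0.77, 0.81, 0.79, 0.84, 0.76,
                           0.80, 0.83, 0.78, 0.81, 0.79, 0.82, 0.77, 0.80, 0.84, 0.79])

    # 95% CI, adjusted for the overlap between training sets across iterations;
    # the (lower, upper) return format is an assumption
    ci = compute_confidence(accuracies, N_train=80, N_test=20, alpha=0.95)
    print(ci)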
- WORC.plotting.compute_CI.compute_confidence_bootstrap(bootstrap_metric, test_metric, N_1, alpha=0.95)[source]¶
Function to calculate the confidence interval for bootstrapped samples.
- bootstrap_metric: numpy array containing the result of a metric for the different bootstrap iterations
- test_metric: the value of the metric evaluated on the true, full test set
- alpha: float ranging from 0 to 1 to calculate the alpha*100% CI, default 0.95
- WORC.plotting.compute_CI.compute_confidence_logit(metric, N_train, N_test, alpha=0.95)[source]¶
Function to calculate the adjusted confidence interval.
- metric: numpy array containing the result of a metric for the different cross-validations (e.g. if 20 cross-validations are performed, a list of length 20 with the calculated accuracy for each cross-validation)
- N_train: integer, number of training samples
- N_test: integer, number of test samples
- alpha: float ranging from 0 to 1 to calculate the alpha*100% CI, default 0.95
linstretch Module¶
plot_ROC Module¶
- WORC.plotting.plot_ROC.curve_thresholding(metric1t, metric2t, thresholds, nsamples=20)[source]¶
Construct metric1 and metric2 (either FPR and TPR for a ROC curve, or TPR and precision for a precision-recall curve) at different thresholds on the scores of an estimator.
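For illustration, a plain-numpy sketch of the underlying idea (not the WORC implementation itself): sweep a fixed number of thresholds over the score range and compute the two metrics, here TPR and FPR, at each:

    import numpy as np

    def roc_points(y_true, y_score, n_thresholds=20):
        # Compute (FPR, TPR) pairs at evenly spaced score thresholds
        thresholds = np.linspace(y_score.min(), y_score.max(), n_thresholds)
        fpr, tpr = [], []
        for t in thresholds:
            y_pred = (y_score >= t).astype(int)
            tp = np.sum((y_pred == 1) & (y_true == 1))
            fp = np.sum((y_pred == 1) & (y_true == 0))
            fn = np.sum((y_pred == 0) & (y_true == 1))
            tn = np.sum((y_pred == 0) & (y_true == 0))
            tpr.append(tp / (tp + fn) if tp + fn else 0.0)
            fpr.append(fp / (fp + tn) if fp + tn else 0.0)
        return np.array(fpr), np.array(tpr), thresholds

The nsamples/tsamples parameters of the functions in this module control how many such threshold points are used.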
- WORC.plotting.plot_ROC.plot_PRC_CIc(y_truth, y_score, N_1, N_2, plot='default', alpha=0.95, verbose=False, DEBUG=False, tsamples=20)[source]¶
Plot a Precision-Recall curve with confidence intervals.
- tsamples: number of sample points at which to determine the confidence intervals.
The sample points are used as thresholds on y_score.
- WORC.plotting.plot_ROC.plot_ROC(prediction, pinfo, ensemble_method='top_N', ensemble_size=1, label_type=None, ROC_png=None, ROC_tex=None, ROC_csv=None, PRC_png=None, PRC_tex=None, PRC_csv=None)[source]¶
- WORC.plotting.plot_ROC.plot_ROC_CIc(y_truth, y_score, N_1, N_2, plot='default', alpha=0.95, verbose=False, DEBUG=False, tsamples=20)[source]¶
Plot a Receiver Operating Characteristic (ROC) curve with confidence intervals.
- tsamples: number of sample points at which to determine the confidence intervals.
The sample points are used as thresholds on y_score.
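A minimal usage sketch for the ROC variant; the assumption here is that y_truth and y_score are lists with one array per cross-validation iteration, and that N_1 and N_2 are the numbers of training and test samples. plot_PRC_CIc shares the same signature for the precision-recall variant:

    import numpy as np
    from WORC.plotting.plot_ROC import plot_ROC_CIc

    rng = np.random.default_rng(42)

    # Hypothetical true labels and estimator scores for 10 cross-validation iterations
    y_truth = [rng.integers(0, 2, 20) for _ in range(10)]
    y_score = [rng.random(20) for _ in range(10)]

    plot_ROC_CIc(y_truth, y_score, N_1=80, N_2=20, alpha=0.95, tsamples=20)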
plot_barchart Module¶
- WORC.plotting.plot_barchart.plot_barchart(prediction, estimators=10, label_type=None, output_tex=None, output_png=None)[source]¶
Make a barchart of the top X hyperparameter settings of the ranked estimators over all cross-validation iterations, as in the usage sketch below.
Parameters¶
- prediction: filepath, mandatory
Path pointing to the .hdf5 file which is the output of the trainclassifier function.
- estimators: integer, default 10
Number of hyperparameter settings/estimators used in each cross-validation. The settings are ranked, so when supplying e.g. 10, the best 10 settings in each cross-validation iteration will be used.
- label_type: string, default None
The name of the label predicted by the estimator. If None, the first label from the prediction file will be used.
- output_tex: filepath, optional
If given, the barchart will be written to this tex file.
- output_png: filepath, optional
If given, the barchart will be written to this png file.
Returns¶
- fig: matplotlib figure
The figure in which the barchart is plotted.
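A minimal usage sketch; the input path is a placeholder for a trainclassifier .hdf5 output:

    from WORC.plotting.plot_barchart import plot_barchart

    # Placeholder path to the .hdf5 output of the trainclassifier function
    fig = plot_barchart('/path/to/estimator.hdf5',
                        estimators=10,
                        output_png='barchart.png')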
plot_boxplot_features Module¶
- WORC.plotting.plot_boxplot_features.generate_feature_boxplots(image_features, label_data, output_zip, dpi=500, verbose=False)[source]¶
Generate boxplots of the feature values among different objects.
Parameters¶
- image_features: list, mandatory
List with a dictionary of the feature labels and values for each patient.
- label_data: pandas dataframe, mandatory
Dataframe containing the labels of the objects.
- output_zip: path, mandatory
Zip file to which the output boxplots should be written.
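A minimal call sketch; image_features and label_data are placeholders constructed elsewhere in a WORC pipeline, following the documented formats:

    from WORC.plotting.plot_boxplot_features import generate_feature_boxplots

    # image_features: list with a dictionary of feature labels and values per patient
    # label_data: pandas dataframe containing the labels of the objects
    # (both are placeholders here)
    generate_feature_boxplots(image_features, label_data,
                              output_zip='boxplots.zip', dpi=500)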
plot_boxplot_performance Module¶
plot_errors Module¶
plot_estimator_performance Module¶
- WORC.plotting.plot_estimator_performance.combine_multiple_estimators(predictions, label_data, N_1, N_2, multilabel_type, label_types, ensemble=1, strategy='argmax', alpha=0.95)[source]¶
Combine multiple estimators into a single model.
Note: the multilabel_type labels should correspond to the ordering in label_types. Hence, if multilabel_type = 0, the prediction is label_types[0], etc.
- WORC.plotting.plot_estimator_performance.compute_statistics(y_truth, y_score, y_prediction, modus, regression)[source]¶
Compute statistics on predictions.
- WORC.plotting.plot_estimator_performance.fit_thresholds(thresholds, estimator, label_type, X_train, Y_train, ensemble_method, ensemble_size, ensemble_scoring)[source]¶
- WORC.plotting.plot_estimator_performance.plot_estimator_performance(prediction, label_data, label_type, crossval_type=None, alpha=0.95, ensemble_method='top_N', ensemble_size=100, verbose=True, ensemble_scoring=None, output=None, modus=None, thresholds=None, survival=False, shuffle_estimators=False, bootstrap=None, bootstrap_N=None, overfit_scaler=None, save_memory=True, refit_ensemble=False)[source]¶
Plot the output of a single estimator, e.g. an SVM.
Parameters¶
- prediction: pandas dataframe or string, mandatory
Output of the trainclassifier function: either a pandas dataframe or an HDF5 file.
- label_data: string, mandatory
Contains the path referring to a .txt file containing the patient label(s) and value(s) to be used for learning. See the Github Wiki for the format.
- label_type: string, mandatory
Name of the label to extract from the label data to test the estimator on.
- alpha: float, default 0.95
Confidence level used for the confidence intervals.
- ensemble_method: string, default ‘top_N’
Determine which method to use for creating the ensemble. Choices: top_N or Caruana
- ensemble_size: int, default 100
Determine the size of the ensemble. Only relevant for top_N
- verbose: boolean, default True
Print intermediate messages.
- ensemble_scoring: string, default None
Metric to be used for evaluating the ensemble. If None, the option set in the prediction object will be used.
- output: string, default stats
Determine which results are returned. If 'stats', the statistics of the estimator will be returned. If 'scores', the scores will be returned.
- thresholds: list of integer(s), default None
If None, use the default sklearn threshold (0.5) on the posteriors to convert them to a binary prediction. If one value is provided, use that as the threshold. If two values are provided, posterior < thresholds[0] maps to 0 and posterior > thresholds[1] maps to 1.
Returns¶
Depending on the output parameter, the following outputs are returned (a usage sketch follows this list):
If output == 'stats':
- stats: dictionary
Contains the confidence intervals of the performance metrics and the number of times each patient was classified correctly or incorrectly.
If output == 'scores':
- y_truths: list
Contains the true label for each object.
- y_scores: list
Contains the score (e.g. posterior) for each object.
- y_predictions: list
Contains the predicted label for each object.
- pids: list
Contains the patient ID/name for each object.
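A minimal usage sketch; the paths and label name are placeholders, and the exact layout of the returned stats dictionary is an assumption:

    from WORC.plotting.plot_estimator_performance import plot_estimator_performance

    stats = plot_estimator_performance(
        prediction='/path/to/estimator.hdf5',   # placeholder: trainclassifier output
        label_data='/path/to/labels.txt',       # placeholder: patient label file
        label_type='tumor_grade',               # hypothetical label name
        alpha=0.95,
        ensemble_method='top_N',
        ensemble_size=100,
        output='stats')

    # Assumed layout: metric name -> confidence interval
    for metric, ci in stats.items():
        print(metric, ci)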
plot_hyperparameters Module¶
- WORC.plotting.plot_hyperparameters.plot_hyperparameters(prediction, label_type=None, estsize=50, output=None, removeconstants=False, verbose=False)[source]¶
Gather which hyperparameters have been used in the best workflows.
Parameters¶
- prediction: pandas dataframe or string, mandatory
Output of the trainclassifier function: either a pandas dataframe or an HDF5 file.
- estsize: integer, default 50
Number of estimators that should be taken into account.
- output: filename of csv, default None
Output file to write to. If None, no output file is written; the result is only returned as a variable.
- removeconstants: boolean, default False
Determine whether to remove any hyperparameters which have the same value in all workflows.
- verbose: boolean, default False
Whether to show print messages or not.
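A minimal usage sketch; the input path is a placeholder:

    from WORC.plotting.plot_hyperparameters import plot_hyperparameters

    # Placeholder path to the trainclassifier .hdf5 output; with output given,
    # the result is also written to the csv file
    hyperparameters = plot_hyperparameters('/path/to/estimator.hdf5',
                                           estsize=50,
                                           output='hyperparameters.csv',
                                           removeconstants=True)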
plot_images Module¶
- WORC.plotting.plot_images.plot_im_and_overlay(image, mask=None, figsize=(3, 3), alpha=0.4, color='cyan', colormap='gray', colorbar=False)[source]¶
Plot an image in a matplotlib figure and overlay with a mask.
- WORC.plotting.plot_images.slicer(image, mask=None, output_name=None, output_name_zoom=None, thresholds=[-5, 5], zoomfactor=4, dpi=500, normalize=True, expand=False, boundary=False, square=False, flip=True, rot90=0, alpha=0.4, axis='axial', index=None, color='cyan', radius=2, colormap='gray', fill=False)[source]¶
Plot the slice of the image on which the mask is largest, with the mask as an overlay.
image and mask should both be arrays.
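A minimal usage sketch for the overlay function, assuming a 2-D array input and that a matplotlib figure is returned:

    import numpy as np
    from WORC.plotting.plot_images import plot_im_and_overlay

    # Hypothetical 2-D image with a square binary mask
    image = np.random.rand(64, 64)
    mask = np.zeros((64, 64))
    mask[20:40, 20:40] = 1

    # Return value assumed to be a matplotlib figure
    fig = plot_im_and_overlay(image, mask=mask, alpha=0.4, color='cyan')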
plot_pvalues_features Module¶
plot_ranked_scores Module¶
- WORC.plotting.plot_ranked_scores.flatten_object(input)[source]¶
Flatten various objects to a 1D list.
- WORC.plotting.plot_ranked_scores.plot_ranked_images(pinfo, label_type, images, segmentations, ranked_truths, ranked_scores, ranked_PIDs, output_zip=None, output_itk=None, zoomfactor=4, scores='percentages')[source]¶
- WORC.plotting.plot_ranked_scores.plot_ranked_percentages(estimator, pinfo, label_type=None, ensemble_method='top_N', ensemble_size=100, output_csv=None)[source]¶
- WORC.plotting.plot_ranked_scores.plot_ranked_posteriors(estimator, pinfo, label_type=None, ensemble_method='top_N', ensemble_size=100, output_csv=None)[source]¶
- WORC.plotting.plot_ranked_scores.plot_ranked_scores(estimator, pinfo, label_type, scores='percentages', images=[], segmentations=[], ensemble_method='top_N', ensemble_size=100, output_csv=None, output_zip=None, output_itk=None)[source]¶
Rank the patients according to their average score. The score can either be the average posterior or the percentage of times the patient was classified correctly in the cross-validations. Additionally, the middle slice of each patient is plotted and saved according to the ranking. A usage sketch follows the parameter list below.
Parameters¶
- estimator: filepath, mandatory
Path pointing to the .hdf5 file which is the output of the trainclassifier function.
- pinfo: filepath, mandatory
Path pointing to the .txt file which contains the patient label information.
- label_type: string, default None
The name of the label predicted by the estimator. If None, the first label from the prediction file will be used.
- scores: string, default percentages
Type of scoring to be used. Either ‘posteriors’ or ‘percentages’.
- images: list, optional
List containing the filepaths to the ITKImage image files of the patients.
- segmentations: list, optional
List containing the filepaths to the ITKImage segmentation files of the patients.
- ensemble_method: string, optional
Method to be used for ensembling.
- ensemble_size: int, optional
If top_N method is used, number of workflows to be included in ensemble.
- output_csv: filepath, optional
If given, the scores will be written to this csv file.
- output_zip: filepath, optional
If given, the images will be plotted and the pngs saved to this zip file.
- output_itk: filepath, optional
WIP
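A minimal usage sketch for plot_ranked_scores, as referenced above; the paths and label name are placeholders:

    from WORC.plotting.plot_ranked_scores import plot_ranked_scores

    plot_ranked_scores(estimator='/path/to/estimator.hdf5',  # placeholder
                       pinfo='/path/to/labels.txt',          # placeholder
                       label_type='tumor_grade',             # hypothetical label name
                       scores='percentages',
                       ensemble_method='top_N',
                       ensemble_size=100,
                       output_csv='ranked_scores.csv')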