Additional functionality¶
When using SimpleWORC, or WORC with similar simple configuration settings, you can
already benefit from the main functionality of WORC, i.e. the automatic algorithm
optimization. However, several additional functionalities are provided, which are discussed in
this chapter.
For a description of the radiomics features, please see the radiomics features chapter. For a description of the data mining components, see the data mining chapter. All other components are discussed here.
For a comprehensive overview of all functions and parameters, please look at the config chapter.
Image Preprocessing¶
Preprocessing of the image, and accordingly the mask, is done in the
WORC.processing.preprocessing and WORC.processing.segmentix scripts, respectively.
Options for preprocessing the image include, in the following order:
N4 Bias field correction, see also https://simpleitk.readthedocs.io/en/master/link_N4BiasFieldCorrection_docs.html.
Checking and optionally correcting the spacing if it’s 1x1x1 and the DICOM metadata says otherwise.
Clipping of the image intensities above and below a certain value.
Normalization, see WORC.processing.preprocessing.normalize_image for all options.
Transposing the image to another "main" orientation, e.g. axial.
Resampling the image to a different spacing.
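To make the clipping and normalization steps concrete, here is a minimal NumPy sketch. It is not WORC's implementation: the function name clip_and_normalize, the clipping bounds, and the choice of z-score normalization are assumptions made for illustration only. See WORC.processing.preprocessing.normalize_image for the options WORC actually provides.

```python
import numpy as np


def clip_and_normalize(image, lower, upper):
    """Clip intensities to [lower, upper], then z-score normalize.

    Hypothetical helper for illustration; not part of WORC.
    """
    clipped = np.clip(image.astype(float), lower, upper)
    std = clipped.std()
    if std == 0:
        # Constant image: only center it to avoid dividing by zero
        return clipped - clipped.mean()
    return (clipped - clipped.mean()) / std


# Toy 2x3 "image" with one extreme outlier
image = np.array([[0, 10, 20], [30, 40, 1000]])
normalized = clip_and_normalize(image, lower=0, upper=50)
```

Clipping before normalizing keeps a single outlier from dominating the mean and standard deviation, which is presumably why the clipping step comes first in the order above.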
Options for preprocessing the segmentation include:
Hole filling. Many feature computations cannot deal with holes.
Removing small objects. Many feature computations cannot deal with multiple objects in a single segmentation.
Extracting the largest blob. Many feature computations cannot deal with multiple objects in a single segmentation.
Instead of using the full segmentation, extracting a ring around the border of the segmentation to compute the features on. The ring captures both the inner and outer border.
Dilating the contour.
Masking the contour with another contour.
When the image and the segmentation are assumed to share the same metadata, copying the metadata of the image to the segmentation.
Checking and optionally correcting the spacing if it’s 1x1x1 and the DICOM metadata says otherwise. Same as image preprocessing step 2.
Transposing the segmentation to another "main" orientation, e.g. axial. Same as image preprocessing step 5.
Resampling the segmentation to a different spacing. Same as image preprocessing step 6.
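Several of the segmentation options above (hole filling, keeping the largest blob, and extracting a ring) can be sketched with scipy.ndimage. This is an illustration under stated assumptions, not the code in WORC.processing.segmentix: the function names largest_blob_filled and ring, the ring radius, and the toy mask are all hypothetical.

```python
import numpy as np
from scipy import ndimage


def largest_blob_filled(mask):
    """Fill holes, then keep only the largest connected component."""
    filled = ndimage.binary_fill_holes(mask)
    labels, n_objects = ndimage.label(filled)
    if n_objects == 0:
        return filled
    # Pixel count per label; index 0 is the background, so skip it
    sizes = np.bincount(labels.ravel())[1:]
    return labels == (np.argmax(sizes) + 1)


def ring(mask, radius=1):
    """Band around the contour covering the inner and outer border."""
    outer = ndimage.binary_dilation(mask, iterations=radius)
    inner = ndimage.binary_erosion(mask, iterations=radius)
    return outer & ~inner


# Toy 2D mask: a 5x5 square with a hole, plus a small separate object
mask = np.zeros((9, 12), dtype=bool)
mask[2:7, 2:7] = True
mask[4, 4] = False        # hole
mask[1:3, 9:11] = True    # small second object

clean = largest_blob_filled(mask)
border = ring(clean)
```

In this sketch the ring is the dilated mask minus the eroded mask, so it spans pixels on both sides of the original contour.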
Image Registration¶
Documentation WIP.
ComBat¶
ComBat feature harmonization is embedded in WORC. A wrapper, compatible with the other tools provided by WORC, is included in the installation. It wraps both the Matlab and the Python (neurocombat) implementations of the original ComBat code. We recommend using the Python implementation by default.
When using ComBat, the following configuration is required:
Set config['General']['ComBat'] to 'True'.
To change the ComBat parameters (i.e. which batch and moderation variable to use), change the relevant config fields, see the Config chapter.
WORC extracts the batch and moderation variables from the label file, which you also use to give WORC the actual label you want to predict. The same format therefore applies, see the User manual for more details.
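As a minimal sketch of the first step, the snippet below switches ComBat on in a configuration object. It assumes the WORC configuration behaves like a Python configparser object, so a bare ConfigParser is used here as a stand-in; in a real experiment you would modify the configuration that WORC provides instead.

```python
import configparser

# Stand-in for the WORC configuration (assumption: configparser-like object)
config = configparser.ConfigParser()
config.add_section('General')

# Values are strings, hence 'True' and not the boolean True
config['General']['ComBat'] = 'True'

# configparser can interpret the string as a boolean when reading it back
combat_enabled = config['General'].getboolean('ComBat')
```

Note that configparser only accepts string values, which is consistent with the 'True' notation used above.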
Note
In line with current literature, ComBat is applied once on the full dataset directly after the feature extraction, and thus before the actual hyperparameter optimization. Hence, to avoid serious overfitting, we advise you to NEVER use the variable you are trying to predict as the moderation variable.
ICC¶
Documentation WIP.
Additional classifiers¶
When using the XGBoost classifiers or regressors, install xgboost, which can be done using pip,
see https://xgboost.readthedocs.io/en/latest/python/python_intro.html.
WORC makes use of the scikit-learn API of XGBoost.
SMAC¶
Note
The SMAC algorithm only works on Linux, because of its random forest surrogate model
implementation. Make sure to use swig3.0. To circumvent pyrfr issues
with SMAC, we use a custom SMAC package that needs to be installed separately.
Steps to take in order to use SMAC within WORC:
sudo apt-get remove swig
sudo apt-get install swig3.0
sudo ln -s /usr/bin/swig3.0 /usr/bin/swig
pip install pyrfr==0.8.0
pip install git+https://github.com/mitchelldeen/SMAC3.git
The SMAC algorithm, using Bayesian optimization, can be used for the hyperparameter optimization by
setting the config['SMAC']['use'] parameter to 'True'. For details on which SMAC parameters
can be modified, see the Config chapter.
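Because SMAC only works on Linux, it can be convenient to guard the setting with a platform check. The sketch below is one possible way to do that: the helper enable_smac_if_supported is hypothetical and not part of WORC, and a bare ConfigParser again stands in for the WORC configuration.

```python
import configparser
import sys


def enable_smac_if_supported(config, platform=sys.platform):
    """Switch SMAC on only when running on Linux.

    Hypothetical helper; returns True if SMAC was enabled.
    """
    supported = platform.startswith('linux')
    if not config.has_section('SMAC'):
        config.add_section('SMAC')
    config['SMAC']['use'] = 'True' if supported else 'False'
    return supported


config = configparser.ConfigParser()  # stand-in for the WORC configuration
enabled = enable_smac_if_supported(config)
```

The platform argument is only there to make the helper easy to test; in normal use the default, sys.platform, is what you want.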
The core functionality of SMAC within WORC is implemented in
WORC.resources.fastr_tools.worc.bin.smac_tool. The configuration space of SMAC is specified
in WORC.classification.smac, which is also where new methods can be added to the search space.
There is additional output when using SMAC. The final output file smac_results_all_0.json
is added along with the regular performance file in the output folder. It contains information on the
optimization procedure for each cross-validation split, with statistics on the performance and all
intermediate best found configurations. The end of the file contains a summary of the average statistics
over all train-test cross-validations.