Developer documentation

Adding a feature processing toolbox

We suggest to use the wrapping we did around the PyRadiomics toolbox as an example.

  1. Check if config type is in fastr.types: else, make your own. See the WORC datatypes for examples.

  2. Make a fastr tool, which basically is a XML wrapper around your tool telling fastr what the inputs and outputs are. You can take our wrappers around pyradiomics as example: see the command line wrapper for when your toolbox is command line executable, or the Python wrapper for when your toolbox needs an interpreter such as Python. In the latter case, you will also need to include an actual script using the interpreter, which is by default placed in the bin folder together with the wrapper. The script will need to parse the arguments defined in the XML, see here for the one using PyRadiomics.

    For more information, visit the fastr documentation.

  3. In WORC, go to the WORC.WORC.WORC.add_feature_calculator function and change the following:

  1. Add a converter for your config file if you do not use the .ini format WORC by default uses.

  2. If you followed a, also add a config safe function to WORC.WORC.WORC.save_config.

  3. Make sure you add both the sources and sinks for your tools.

  4. Link these sources and sinks to the fastr network in the WORC.WORCWORC.set function.

  5. Add part to feature converter

  1. To convert the feature files from your toolbox to WORC compatible format, add the neccesary functionality to the WORC.featureprocessing.FeatureConverter function.

  2. Tell WORC to use your feature calculator by changing the relevant config field: config['General']['FeatureCalculators].

  3. Optionally, add the parameters specifically for your toolbox to the WORC configuration in WORC.WORC.WORC.defaultconfig.

Testing and example data

WIP

Adding methods to hyperoptimization

  1. Add the parameters to the relevant parts of the WORC configuration in WORC.WORC.WORC.defaultconfig.

    Please add a description of these parameters and their potential values to the documentation, see https://github.com/MStarmans91/WORC/blob/master/WORC/doc/generate_config.py.

  2. Make sure these fields are adequately parsed by adding parsing to WORC.IOparser.config_io_classifier.load_config.

  3. Define how the parameters you just added are added to the search space in WORC.classification.trainclassifier.trainclassifier, or WORC.classification.construct_classifier both the construct_classifier and create_param_grid functions for machine learning estimators (e.g. classifiers). For example, your parameters may define the actual options, or the min-max of a distribution (e.g. uniform, log-uniform). See the referred function for some examples. Make sure that the object is iterable.

  4. These parameters will end up in the function fitting the workflow: WORC.classification.fitandscore.fit_and_score. Hence, add a part to that function to embed your method in the workflow. We advice you to embed your method in a sklearn compatible class, having init, fit and transform functions. See for example WORC.featureprocessing.Preprocessor.Preprocessor.

    In WORC.classification.fitandscore.fit_and_score, make sure that after fitting your object, the parameters used are deleted from the config, as is done for the other methods as well.

    Lastly, in WORC.classification.fitandscore.fit_and_score, make sure the fitted object is returned. We recommend looking at the imputer object and similarly including your object.

    This is given to various objects in the WORC.classification.SearchCV module. Therefore, add the returned object to all the parts were fitted objects are used: we recommend looking everywhere the imputer is stated in WORC.classification.SearchCV, copying those five statements and replace imputer with however you called your methods. You can see that this is also similar to e.g. the scaler, pca, and groupsel objects.

  5. If you want your new method to be used by the SimpleWORC or a child facade, check WORC.facade.SimpleWORC to see if you need to add it, e.g. whitelist a classifier.

Adding a (plotting) tool to the WORC evaluation pipeline

We illustrate here how plotting the ROC curves is embedded in WORC, and to follow or even copy-paste this example to add your own tools.

  1. Write a script to perform the actual analysis, preferably stored in the plotting subfolder. See for example the one of the ROC curves.

  2. Make it command-line executable and able to parse input arguments. For the ROC curves, we did this in the above script as well, but that is optional. It should be stored in the WORC fastr_tools folder, see also the ROC script in that folder, for step 3. Make sure it both is able to take in input arguments (parameters, file names) and output arguments (file names) to store the results in.

  3. Make a fastr tool, i.e., a wrapper around your main function that fastr can call. See the general fastr documentation on creating your own tool, and the WORC tool for the ROC curve plotting< https://github.com/MStarmans91/WORC/blob/master/WORC/resources/fastr_tools/worc/PlotROC.xml/`>_. You can see we call the script from step 2 for this. For the input and output files, you can run fastr.types to see which datatypes are in your current fastr installation. You an see we added several in WORC, see also the WORC fastr_types folder.

  4. Now add it the the Evaluation part of WORC. If you are new to creating fastr networks, you may want to check out the fastr documentation, but in principle you can just copy paste again the parts of the plotting of the ROC curve. Make sure you add: the additional sources (inputs) your tool requires if they are not already in the rest of WORC, the actual tool you made, and sinks (outputs) so the output is also actually stored when your tool is done in an output folder.