Stage 2: Metadata Profiling

To understand a dataset’s metadata, the WhiteRabbit statistical profiling tool from OHDSI is used. This tool has been vetted by the UoN Information Security and Compliance team and has passed the UoN DPIA process.

A full User Guide and Introduction to WhiteRabbit is here:

Profile Data

1. Setup WhiteRabbit

  • Open the WhiteRabbit tool.

Metadata extraction for csv files:

  • Locations screen:
    • Enter ‘Working Folder’ location (see image below).
    • Click ‘Test Connection’ to ensure that it is working.
    • Then go to the ‘Scan’ tab.

Figure 10: Profiling Data in Metadata Profiling.

Figure 11: Example of WhiteRabbit ‘Location’ Screen Option

1.1 Setup WhiteRabbit (Second Part)

  • Scan Screen:

    • Click on the ‘Scan’ tab.
    • Click ‘Add’ to add the required dataset .csv files.
    • The ‘Min Cell Count’ is set to 5 by default. This can be altered according to the data owner’s requirements.
    • Press ‘Scan Tables’ to run WhiteRabbit.
    • This produces a ‘Scan Report’ of the dataset metadata in the Working Folder.
  • For Databases

Figure 12: Example of WhiteRabbit’s ‘Scan’ Screen Option
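
To illustrate what the ‘Min Cell Count’ setting in the Scan Screen steps is understood to do, the following minimal Python sketch mimics the idea of leaving out values that occur fewer times than the threshold. It is not WhiteRabbit’s own code: the function name `visible_values` and the sample column are hypothetical, and the exact boundary behaviour (values at exactly the threshold being kept) is an assumption that should be confirmed against the WhiteRabbit documentation.

```python
from collections import Counter


def visible_values(values, min_cell_count=5):
    """Return {value: frequency} for values seen at least min_cell_count times.

    Assumption: values with a frequency below the threshold are left out,
    which is how the 'Min Cell Count' option is understood to behave.
    """
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n >= min_cell_count}


# Hypothetical column: "A" occurs 7 times, "B" 5 times, "C" only twice.
column = ["A"] * 7 + ["B"] * 5 + ["C"] * 2

print(visible_values(column))                    # {'A': 7, 'B': 5}
print(visible_values(column, min_cell_count=1))  # {'A': 7, 'B': 5, 'C': 2}
```

A higher threshold hides more rare values, which is why the data owner may ask for it to be raised for small or sensitive datasets.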

Checking Scan Report

Examine Scan Report

  • Open the WhiteRabbit Scan Report Excel file from the Working Folder.
  • Check that:
    • A sheet has been created for each .csv file added to WhiteRabbit.
    • The name of each sheet tab is the same as the name of the .csv file it represents.
    • The results on the Field Overview sheet are correct.
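
The sheet-name checks above can also be sketched in Python as a supplement to the manual review. This is a minimal sketch under stated assumptions: the function `compare_sheets_to_files` and the file names are hypothetical, sheet names are assumed to match the .csv file names exactly (including the extension), and the list of sheet names is assumed to have been read from the Scan Report beforehand (for example with the `openpyxl` library’s `load_workbook(path).sheetnames`).

```python
from pathlib import Path


def compare_sheets_to_files(csv_paths, sheet_names, ignore=("Field Overview",)):
    """Compare scanned .csv file names with the sheet names in a Scan Report.

    Returns (missing, unexpected):
      missing    - .csv files that have no sheet of the same name
      unexpected - sheets that match no .csv file (overview sheets ignored)
    """
    expected = {Path(p).name for p in csv_paths}
    actual = set(sheet_names) - set(ignore)
    return sorted(expected - actual), sorted(actual - expected)


# Hypothetical inputs: one sheet name does not match its file name.
csv_files = ["data/patients.csv", "data/visits.csv", "data/labs.csv"]
sheets = ["Field Overview", "patients.csv", "visits.csv", "lab.csv"]

missing, unexpected = compare_sheets_to_files(csv_files, sheets)
print(missing)      # ['labs.csv']
print(unexpected)   # ['lab.csv']
```

Two empty lists mean every file has a matching sheet. If the report contains further overview sheets, their names can be added to `ignore`.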

Check Anonymisation

  • Check that no Direct Identifier data values are present within the Scan Report.
  • Check that no Indirect Identifier data values are present within the Scan Report.

See Personally Identifiable Information for Direct and Indirect Identifier information.
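
As a supplement to the manual anonymisation check, the search for known identifier values can be sketched in Python. This is only an illustration, not a replacement for reviewing the report: the function `find_identifier_leaks`, the sample rows, and the identifier values are all hypothetical, and the rows are assumed to have been read from a Scan Report sheet as tuples of cell values (for example with `openpyxl`’s `iter_rows(values_only=True)`). Matching is exact, ignoring case and surrounding spaces, so partial or reformatted values would not be found.

```python
def find_identifier_leaks(rows, identifiers):
    """Return (row, column, value) for every cell equal to a known identifier.

    Row and column numbers start at 1, as they do in a spreadsheet.
    Comparison ignores case and leading/trailing whitespace.
    """
    wanted = {str(value).strip().lower() for value in identifiers}
    leaks = []
    for row_idx, row in enumerate(rows, start=1):
        for col_idx, cell in enumerate(row, start=1):
            if cell is not None and str(cell).strip().lower() in wanted:
                leaks.append((row_idx, col_idx, cell))
    return leaks


# Hypothetical sheet contents: value/frequency column pairs per field.
rows = [
    ("patient_id", "Frequency", "postcode", "Frequency"),
    ("List truncated...", None, "NG7", 12),
    ("ID-0001", 6, "NG7 2RD", 5),
]
# Hypothetical identifier values supplied by the data owner.
identifiers = ["ID-0001", "ng7 2rd"]

print(find_identifier_leaks(rows, identifiers))
# [(3, 1, 'ID-0001'), (3, 3, 'NG7 2RD')]
```

Any hit points to a cell that should be reviewed with the data owner before the Scan Report is shared.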

Figure 13: Checking Scan Report in Metadata Profiling.

Figure 14: Example of WhiteRabbit Scan Report - Aspects to Check.