Carrot-Transform Quick Start Guide

Installation

Carrot-Transform is available on PyPI, so you can install it with:

pip install carrot-transform

If you are working with the source code, refer to the Development Notes.

Running Carrot-Transform

To execute Carrot-Transform, run:

carrot-transform [command] [options]

For example, you can get the version number with:

carrot-transform -v
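
The installed version can also be checked from Python, which is handy in scripts that need to confirm the package is present before calling it. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of carrot-transform.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="carrot-transform"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the PyPI package is installed, otherwise None.
print(installed_version())
```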

Carrot-Transform takes a number of mandatory and optional arguments. This quick start demonstrates the mandatory ones on a test case (taken from carrot-CDM) that is included in the repository.

Basic Example

To process the test dataset included in the repository, enter the following as one command:

carrot-transform run mapstream \
  --input-dir @carrot/examples/test/inputs \
  --rules-file @carrot/examples/test/rules/rules_14June2021.json \
  --person-file @carrot/examples/test/inputs/Demographics.csv \
  --output-dir carrottransform/examples/test/test_output \
  --omop-ddl-file @carrot/config/OMOPCDM_postgresql_5.3_ddl.sql \
  --omop-config-file @carrot/config/omop.json

‘@carrot’ is an alias for the folder containing the carrot-transform module, so it works with either installation method. When using your own files, give their actual paths and omit the alias.
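
For scripted runs, the same invocation can be assembled as an argument list, which avoids mistakes with shell line continuations. The sketch below only builds and prints the command; it assumes `carrot-transform` is on your PATH when you actually execute it. The helper name `build_mapstream_command` is hypothetical, and only the flags from the example above are used.

```python
import shlex


def build_mapstream_command(input_dir, rules_file, person_file, output_dir,
                            omop_ddl_file, omop_config_file):
    """Assemble the argument list for the quick-start example (hypothetical helper)."""
    return [
        "carrot-transform", "run", "mapstream",
        "--input-dir", input_dir,
        "--rules-file", rules_file,
        "--person-file", person_file,
        "--output-dir", output_dir,
        "--omop-ddl-file", omop_ddl_file,
        "--omop-config-file", omop_config_file,
    ]


command = build_mapstream_command(
    input_dir="@carrot/examples/test/inputs",
    rules_file="@carrot/examples/test/rules/rules_14June2021.json",
    person_file="@carrot/examples/test/inputs/Demographics.csv",
    output_dir="carrottransform/examples/test/test_output",
    omop_ddl_file="@carrot/config/OMOPCDM_postgresql_5.3_ddl.sql",
    omop_config_file="@carrot/config/omop.json",
)

# Show the command as a single shell-safe line.
print(shlex.join(command))
# To execute it: import subprocess; subprocess.run(command, check=True)
```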

This will generate a set of output files in this directory:

carrottransform/examples/test/test_output

If it doesn’t exist, this directory should be created for you.
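
To confirm that the run produced output, you can list the TSV files in the output directory and count their rows. This is a rough sketch, not part of carrot-transform: the helper name `summarize_tsv_outputs` is hypothetical, and it assumes each TSV file starts with a header row.

```python
import csv
from pathlib import Path


def summarize_tsv_outputs(output_dir):
    """Map each .tsv file name in output_dir to its number of data rows."""
    directory = Path(output_dir)
    if not directory.is_dir():
        return {}
    summary = {}
    for tsv_path in sorted(directory.glob("*.tsv")):
        with tsv_path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle, delimiter="\t"))
        # Assumption: the first row is a header, so it is not counted.
        summary[tsv_path.name] = max(len(rows) - 1, 0)
    return summary


# Prints nothing if the directory is missing or holds no .tsv files.
for name, row_count in summarize_tsv_outputs(
        "carrottransform/examples/test/test_output").items():
    print(f"{name}: {row_count} rows")
```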

Arguments

Required Arguments

| Flag | Description |
| --- | --- |
| --input-dir | Directory containing input files |
| --rules-file | JSON file with mapping rules |
| --person-file | CSV file where the first column contains person IDs. This is used as a registry for all valid person IDs. Person IDs not found in this registry will be omitted from all OMOP tables. This file can also be used as an input file, if it has additional data, and is located in the input directory. |
| --output-dir | Directory for OMOP-format TSV files |
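
The registry role of the person file can be illustrated with a small sketch: IDs are read from the first column, and any record whose person ID is not in that set is dropped. This is not carrot-transform's implementation. The helper names, the `person_id` field name, the assumption that the CSV has a header row, and the sample data are all made up for illustration.

```python
import csv
import io


def load_person_ids(person_csv_text):
    """Collect the first-column values of a CSV as the set of valid person IDs."""
    reader = csv.reader(io.StringIO(person_csv_text))
    next(reader, None)  # Assumption: the first row is a header.
    return {row[0] for row in reader if row}


def keep_registered(records, person_ids):
    """Drop records whose person ID is not in the registry."""
    return [record for record in records if record["person_id"] in person_ids]


# Made-up data for illustration only.
person_csv = "PersonID,sex\n101,M\n102,F\n"
records = [
    {"person_id": "101", "value": "a"},
    {"person_id": "999", "value": "b"},  # not in the registry, so omitted
]

registry = load_person_ids(person_csv)
print(keep_registered(records, registry))
# prints [{'person_id': '101', 'value': 'a'}]
```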

OMOP Configuration (Choose One Approach)

| Approach | Required Arguments |
| --- | --- |
| Specify Files | --omop-ddl-file (DDL statements for OMOP tables) and --omop-config-file (override JSON config) |
| Specify Version | --omop-version (e.g., 5.3, which will automatically find carrottransform/config/omop.json and carrottransform/config/OMOPCDM_postgresql_XX_ddl.sql) |
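
The version-based lookup amounts to substituting the version string for XX in the DDL file name. The sketch below shows that substitution only; the helper name `omop_files_for_version` is hypothetical, and how carrot-transform resolves these paths internally is not covered here.

```python
from pathlib import PurePosixPath

# Directory named in the table above.
CONFIG_DIR = PurePosixPath("carrottransform/config")


def omop_files_for_version(omop_version):
    """Return the (DDL file, config file) paths implied by an OMOP version string."""
    ddl_file = CONFIG_DIR / f"OMOPCDM_postgresql_{omop_version}_ddl.sql"
    config_file = CONFIG_DIR / "omop.json"
    return ddl_file, config_file


ddl_file, config_file = omop_files_for_version("5.3")
print(ddl_file)     # carrottransform/config/OMOPCDM_postgresql_5.3_ddl.sql
print(config_file)  # carrottransform/config/omop.json
```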

Optional Arguments

| Flag | Default | Description |
| --- | --- | --- |
| --write-mode | w | Set to w (overwrite) or a (append) for output files |
| --saved-person-id-file | None | Path to a file to save and share person_id state |
| --use-input-person-ids | N | Use input person IDs (Y) or replace with new integers (N) |
| --last-used-ids-file | None | Path to a file tracking last used IDs (tab-separated format) |
| --log-file-threshold | 0 | Change output limit for log files |
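
The difference between the two --write-mode values can be seen with ordinary file handling. This sketch assumes that w and a behave like Python's file-open modes of the same names; whether carrot-transform passes the value straight through is an assumption. The helper name `write_twice` is hypothetical.

```python
import os
import tempfile


def write_twice(mode):
    """Write one line, reopen the file with `mode`, write another, return the lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.tsv")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("first run\n")
        with open(path, mode, encoding="utf-8") as handle:
            handle.write("second run\n")
        with open(path, encoding="utf-8") as handle:
            return handle.read().splitlines()


print(write_twice("w"))  # ['second run']
print(write_twice("a"))  # ['first run', 'second run']
```

With w, a second run replaces earlier output; with a, new rows are added after the existing ones.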