Carrot-Transform Quick Start Guide

Installation

Carrot-Transform is now available on PyPI, so you can install it with:

pip install carrot-transform

Alternatively, if you are working with the source code, install dependencies using Poetry:

poetry install

To install Poetry, please follow the installation instructions in the Poetry documentation. If you are running from the source code with Poetry, prepend ‘poetry run’ to each example command. For example: poetry run carrot-transform -v

Running Carrot-Transform

To execute Carrot-Transform, run:

carrot-transform [command] [options]

For example, you can get the version number with:

carrot-transform -v

Carrot-Transform takes several required and optional arguments. This quick start demonstrates the required arguments using a test case (taken from carrot-CDM) that is included in the repository.

Basic Example

To process the test dataset included in the repository, run the following as a single command:

carrot-transform run mapstream \
  --input-dir @carrot/examples/test/inputs \
  --rules-file @carrot/examples/test/rules/rules_14June2021.json \
  --person-file @carrot/examples/test/inputs/Demographics.csv \
  --output-dir carrottransform/examples/test/test_output \
  --omop-ddl-file @carrot/config/OMOPCDM_postgresql_5.3_ddl.sql \
  --omop-config-file @carrot/config/omop.json

The ‘@carrot’ prefix is an alias for the folder containing the carrot-transform module, and it works with either installation method. When using your own files, give their actual paths and omit the ‘@carrot’ prefix.

This will generate a set of output files in this directory:

carrottransform/examples/test/test_output

If it doesn’t exist, this directory should be created for you.
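Once the run has finished, a quick way to check the result is to count the rows in each output file. The following is a minimal sketch, not part of Carrot-Transform: the helper name summarize_tsv_dir is hypothetical, and it assumes the output files use a .tsv extension and start with a header row.

```python
import csv
from pathlib import Path


def summarize_tsv_dir(output_dir):
    """Return {file name: number of data rows} for each .tsv file in output_dir."""
    summary = {}
    directory = Path(output_dir)
    if not directory.is_dir():
        # Nothing to report if the run has not created the directory yet.
        return summary
    for tsv_path in sorted(directory.glob("*.tsv")):
        with tsv_path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle, delimiter="\t"))
        # Assumption: the first row of each file is a header row.
        summary[tsv_path.name] = max(len(rows) - 1, 0)
    return summary


if __name__ == "__main__":
    output_dir = "carrottransform/examples/test/test_output"
    for name, count in summarize_tsv_dir(output_dir).items():
        print(f"{name}: {count} rows")
```

If the directory is missing or contains no .tsv files, the script prints nothing, which is itself a hint that the run did not write any output.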

Arguments

Required Arguments

Flag           Description
--input-dir    Directory containing input files
--rules-file   JSON file with mapping rules
--person-file  CSV file where the first column contains person IDs
--output-dir   Directory for OMOP-format TSV files
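Because --person-file expects the person IDs in the first column, it can be worth checking that column before a run. The sketch below is only an illustration: the function name first_column_values and the sample column names are hypothetical, and it assumes the file has a header row.

```python
import csv
import io


def first_column_values(csv_text):
    """Return the values in the first column of a CSV, skipping the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # assumption: the first row is a header
    return [row[0].strip() for row in reader if row]


# Hypothetical sample data; the real person file's columns may differ.
sample = "PersonID,Sex,Age\n101,M,34\n102,F,51\n,F,29\n"
ids = first_column_values(sample)
blank_count = sum(1 for value in ids if not value)
print(f"{len(ids)} rows, {blank_count} with a blank person ID")
```

For a real file, read its contents with open(path, newline="", encoding="utf-8").read() and pass the text to the same function.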

OMOP Configuration (Choose One Approach)

Approach         Required Arguments
Specify Files    --omop-ddl-file (DDL statements for OMOP tables) and --omop-config-file (override JSON config)
Specify Version  --omop-version (e.g., 5.3, which will automatically find carrottransform/config/omop.json and carrottransform/config/OMOPCDM_postgresql_XX_ddl.sql)
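When scripting runs, the two approaches can be kept apart with a small helper that assembles the command line. This is a sketch under assumptions: build_command is a hypothetical helper, not part of Carrot-Transform, and it treats the two approaches as mutually exclusive, which is one reading of "Choose One Approach". It uses only the flags listed above and prints the command rather than running it.

```python
import shlex


def build_command(input_dir, rules_file, person_file, output_dir,
                  omop_version=None, omop_ddl_file=None, omop_config_file=None):
    """Return the carrot-transform argument list for one OMOP configuration approach."""
    uses_version = omop_version is not None
    uses_files = omop_ddl_file is not None or omop_config_file is not None
    if uses_version == uses_files:
        raise ValueError("Choose exactly one approach: --omop-version, or both file flags")
    if uses_files and (omop_ddl_file is None or omop_config_file is None):
        raise ValueError("--omop-ddl-file and --omop-config-file must be given together")

    command = [
        "carrot-transform", "run", "mapstream",
        "--input-dir", input_dir,
        "--rules-file", rules_file,
        "--person-file", person_file,
        "--output-dir", output_dir,
    ]
    if uses_version:
        command += ["--omop-version", str(omop_version)]
    else:
        command += ["--omop-ddl-file", omop_ddl_file,
                    "--omop-config-file", omop_config_file]
    return command


# The basic example, rewritten to use the version-based approach.
command = build_command(
    "@carrot/examples/test/inputs",
    "@carrot/examples/test/rules/rules_14June2021.json",
    "@carrot/examples/test/inputs/Demographics.csv",
    "carrottransform/examples/test/test_output",
    omop_version="5.3",
)
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run(command, check=True) once the tool is installed.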

Optional Arguments

Flag                    Default  Description
--write-mode            w        Set to w (overwrite) or a (append) for output files
--saved-person-id-file  None     Path to a file to save and share person_id state
--use-input-person-ids  N        Use input person IDs (Y) or replace with new integers (N)
--last-used-ids-file    None     Path to a file tracking last used IDs (tab-separated format)
--log-file-threshold    0        Change output limit for log files
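To make the --use-input-person-ids N option more concrete, the sketch below shows what replacing person IDs with new integers means in principle. It is a conceptual illustration only, not Carrot-Transform's implementation: the function name, the starting value of 1, and the order-of-first-appearance numbering are all assumptions.

```python
def assign_new_person_ids(source_ids, start=1):
    """Map each distinct source ID to a new integer, in order of first appearance."""
    mapping = {}
    for source_id in source_ids:
        if source_id not in mapping:
            # Assumption: new IDs are consecutive integers beginning at `start`.
            mapping[source_id] = start + len(mapping)
    return mapping


# Hypothetical source IDs; a repeated ID keeps the integer it was first given.
mapping = assign_new_person_ids(["A-17", "B-02", "A-17", "C-99"])
print(mapping)
```

A stable mapping like this is what allows rows for the same person in different input files to end up with the same new ID.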