Carrot-Transform Quick Start Guide
Installation
Carrot-Transform is now available on PyPI, so you can install it with:
pip install carrot-transform
Alternatively, if you are working with the source code, install dependencies using Poetry:
poetry install
To install Poetry, follow the instructions in the official Poetry documentation.
If you are running from the source code with Poetry, prefix each example command with ‘poetry run’. For example: poetry run carrot-transform -v
Running Carrot-Transform
To execute Carrot-Transform, run:
carrot-transform [command] [options]
For example, you can get the version number with:
carrot-transform -v
Carrot-Transform has many mandatory and optional arguments. This quick start demonstrates the mandatory arguments on a test case (taken from carrot-CDM) that is included in the repository.
Basic Example
To process the test dataset included in the repository, run the following as a single command:
carrot-transform run mapstream \
--input-dir @carrot/examples/test/inputs \
--rules-file @carrot/examples/test/rules/rules_14June2021.json \
--person-file @carrot/examples/test/inputs/Demographics.csv \
--output-dir carrottransform/examples/test/test_output \
--omop-ddl-file @carrot/config/OMOPCDM_postgresql_5.3_ddl.sql \
--omop-config-file @carrot/config/omop.json
The ‘@carrot’ prefix is an alias for the folder containing the carrot-transform module, and it works with either installation method. When using your own files, give their actual paths and omit the alias.
This will generate a set of output files in this directory:
carrottransform/examples/test/test_output
If this directory does not exist, it should be created for you.
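After a run, a quick way to confirm that output was written is to list the TSV files and count their rows. The sketch below is not part of Carrot-Transform: summarise_tsv_dir is a hypothetical helper name, and it assumes each output file starts with a single header line and ends with a newline.

```shell
# Hypothetical helper (not part of Carrot-Transform):
# print "<file name><TAB><data rows>" for every *.tsv file in a directory.
summarise_tsv_dir() {
  local dir="$1" f rows
  for f in "$dir"/*.tsv; do
    [ -e "$f" ] || continue            # glob did not match: no TSV files present
    rows=$(( $(wc -l < "$f") - 1 ))    # assumption: the first line is a header
    printf '%s\t%s\n' "$(basename "$f")" "$rows"
  done
}

# Output directory used in the basic example above.
summarise_tsv_dir carrottransform/examples/test/test_output
```

If the directory is missing or contains no TSV files, the helper prints nothing. Which files appear depends on the tables that the rules file maps to.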
Arguments
Required Arguments
Flag | Description |
---|---|
--input-dir | Directory containing input files |
--rules-file | JSON file with mapping rules |
--person-file | CSV file where the first column contains person IDs |
--output-dir | Directory for OMOP-format TSV files |
OMOP Configuration (Choose One Approach)
Approach | Required Arguments |
---|---|
Specify Files | --omop-ddl-file (DDL statements for OMOP tables) and --omop-config-file (override JSON config) |
Specify Version | --omop-version (e.g., 5.3; the matching carrottransform/config/omop.json and carrottransform/config/OMOPCDM_postgresql_XX_ddl.sql are then found automatically) |
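As an illustration of the "Specify Version" approach, the sketch below repeats the basic example with --omop-version 5.3 in place of --omop-ddl-file and --omop-config-file. The paths are the test-case paths from this guide. The PATH check is an addition for safety, so the snippet only prints the command when carrot-transform is not installed.

```shell
# Arguments for the test case, selecting the OMOP configuration by version
# instead of passing the DDL and config files explicitly.
set -- run mapstream \
  --input-dir @carrot/examples/test/inputs \
  --rules-file @carrot/examples/test/rules/rules_14June2021.json \
  --person-file @carrot/examples/test/inputs/Demographics.csv \
  --output-dir carrottransform/examples/test/test_output \
  --omop-version 5.3

# Show the full command, then run it only if carrot-transform is on the PATH.
echo "carrot-transform $*"
if command -v carrot-transform >/dev/null 2>&1; then
  carrot-transform "$@"
fi
```

With a Poetry source checkout, the equivalent call would be poetry run carrot-transform followed by the same arguments.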
Optional Arguments
Flag | Default | Description |
---|---|---|
--write-mode | w | Set to w (overwrite) or a (append) for output files |
--saved-person-id-file | None | Path to a file to save and share person_id state |
--use-input-person-ids | N | Use input person IDs (Y) or replace with new integers (N) |
--last-used-ids-file | None | Path to a file tracking last used IDs (tab-separated format) |
--log-file-threshold | 0 | Change output limit for log files |
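Optional flags are appended to the mandatory arguments. The sketch below is an assumed scenario, not a recommended configuration: it reruns the test case so that output files are appended to rather than overwritten (--write-mode a) and the person IDs from the input are kept (--use-input-person-ids Y). As before, the PATH check is an addition so the snippet is harmless when carrot-transform is not installed.

```shell
# Mandatory arguments from the basic example, followed by two optional flags:
#   --write-mode a            append to existing output files
#   --use-input-person-ids Y  keep the person IDs found in the input
set -- run mapstream \
  --input-dir @carrot/examples/test/inputs \
  --rules-file @carrot/examples/test/rules/rules_14June2021.json \
  --person-file @carrot/examples/test/inputs/Demographics.csv \
  --output-dir carrottransform/examples/test/test_output \
  --omop-ddl-file @carrot/config/OMOPCDM_postgresql_5.3_ddl.sql \
  --omop-config-file @carrot/config/omop.json \
  --write-mode a \
  --use-input-person-ids Y

# Show the full command, then run it only if carrot-transform is on the PATH.
echo "carrot-transform $*"
if command -v carrot-transform >/dev/null 2>&1; then
  carrot-transform "$@"
fi
```

Appending the same data twice would duplicate rows, so append mode is most useful when a later run processes different input files.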