# tt-npe API and Developer Guide

## Install Dir Structure

Everything is installed to `tt-npe/install/`, including:

- Shared library (`install/lib/libtt_npe.so`)
- C++ API headers (`install/include/`)
- Python CLI using pybind11 (`install/bin/tt_npe.py`)
- C++ CLI (`install/bin/tt_npe_run`)

## Unit Tests

tt-npe has two unit test suites: one for C++ code and one for Python.

### Run All Tests

```
$ tt_npe/scripts/run_ut.sh # can be run from any pwd
```

## Simulating Predefined Workloads Using tt_npe.py

### Environment Setup

Run the following to add the tt-npe install dir to your `$PATH`:

```shell
cd tt-npe/
source ENV_SETUP # adds install/bin/ to $PATH
```

Now run the following:

```shell
tt_npe.py -w tt_npe/workload/example_wl.json
```

**Note**:

- The `-w` argument is *required*; it specifies the JSON workload file to load.
- For a multichip workload, the individual device traces must first be merged and pre-processed using the `fabric_post_process.py` script.

### Other Important Options

Bandwidth derating caused by congestion between concurrent transfers is modelled **by default**. Congestion modelling can be *disabled* using `--cong-model none`.

The `-e` option dumps detailed information about the simulation timeline (e.g. congestion and transfer state at each timestep) into a JSON file, located at `npe_timeline.json` by default. This timeline file can be used with TT-NN Visualizer to visualize congestion and transfer states.

See `tt_npe.py --help` for more information about available options.

## Constructing Workloads Programmatically

tt-npe workloads are composed of collections of `Transfers`. Each `Transfer` represents a *series of back-to-back packets from one source to one or more destinations*. This is roughly equivalent to a single call to the dataflow APIs `noc_async_read` and `noc_async_write`.

`Transfers` are grouped *hierarchically* (see diagram). Each workload is a collection of `Phases`, and each `Phase` is a group of `Transfers`.

![tt-npe workload hierarchy diagram](img/npe_workload_diag.png)

__For most modelling scenarios, putting all `Transfers` in a single monolithic `Phase` is the correct approach__. The purpose of multiple `Phases` is to express the data dependencies and synchronization common in real workloads; however, full support for this is not yet complete.

### *Example - Constructing a Random Workload using Python API*

See the example script `install/bin/programmatic_workload_generation.py` for an annotated example of generating and simulating a tt-npe NoC workload via the Python bindings.

## Python API Documentation

Open `tt_npe/doc/tt_npe_pybind.html` to see full documentation of the tt-npe Python API.

## C++ API

The C++ API requires:

1. Including the header `install/include/npeAPI.hpp`
2. Linking to the shared library `libtt_npe.so`

An illustrative build command is sketched at the end of this guide.

### Reference Example: C++ CLI (tt_npe_run)

See the example C++ CLI source code within `tt-npe/tt_npe/cli/`. It links `libtt_npe.so` as a shared library and serves as a reference for interacting with the API.
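As a rough sketch of the two requirements above, the following command compiles a standalone tool against the installed library. This is illustrative only: `my_tool.cpp`, the output name, and the `-std` flag are placeholder assumptions, and the paths assume the default `tt-npe/install/` layout described above. Consult the CLI source in `tt-npe/tt_npe/cli/` for the flags the project actually builds with.

```shell
# Illustrative sketch only: "my_tool.cpp" and the -std flag are placeholders;
# paths assume the default tt-npe/install/ layout described above.
g++ -std=c++20 my_tool.cpp \
    -I tt-npe/install/include \
    -L tt-npe/install/lib -ltt_npe \
    -Wl,-rpath,"$(pwd)/tt-npe/install/lib" \
    -o my_tool
```

The `rpath` entry lets the resulting binary locate `libtt_npe.so` at run time; alternatively, add `tt-npe/install/lib` to `LD_LIBRARY_PATH`.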