.. _kernel-dev-flow:

Kernel Development Flow
=======================

Developing a kernel
-------------------

Before proceeding, make sure you read the :ref:`computing-with-cerebras` section.

.. admonition:: Scope of this section

    This section does not directly discuss how to write your kernel as a CSL program. However, you should read this section to understand, at a high level, the steps to develop a kernel with CSL. Also see :ref:`cslang-guides`.

CSL code and runtime script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Developing a kernel consists of the following:

- Developing a CSL program, such as a ``.csl`` file that defines the operations a PE or a set of PEs must perform. Often you will create multiple ``.csl`` files. For example, see :ref:`02-multiple-source-files`.
- Compiling the top-level ``.csl`` program with the ``cslc`` compiler.
- Running the program, either with the simulator or on the Cerebras Wafer Scale Engine (WSE), using a runtime configuration script, usually written in Python, such as ``code.csl.run.py``. In this script you provide the input tensors to the simulator.
- When the simulation is complete, reading the simulator output and comparing it against a reference to validate the program output.

Kernel development steps
~~~~~~~~~~~~~~~~~~~~~~~~

The following diagrams show the sequence of steps for developing a kernel.

**Step 1**

.. _cslang-kernel-dev-flow1:

.. figure:: images/cslang-kernel-dev-flow1.png
    :align: center
    :width: 750px

In your CSL code you must explicitly:

- Define a layout by using the ``@set_rectangle()`` function. This defines a rectangular region of contiguous processing elements (PEs).
- For each PE, use ``@set_tile_code()`` to define the code the PE will run.
- Configure the routes and colors with ``@set_color_config()``.

**Steps 2 and 3**

.. _cslang-kernel-dev-flow2:

.. figure:: images/cslang-kernel-dev-flow2.png
    :align: center
    :width: 750px

- Next, compile the top-level ``code.csl`` with the compiler tool ``cslc``. This generates a binary ELF file for each PE.
- Finally, use the runtime Python script ``code.csl.run.py`` to run the code either on the simulator or directly on the Cerebras WSE.

.. note::

    The flow above is the same when you target the actual network-attached Cerebras WSE accelerator device, except that the Run step targets the network-attached CS WSE accelerator instead of the local fabric simulator.

Example walkthrough
-------------------

See :ref:`example-intro` and :ref:`working-with-code-samples`.
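
The final validation step described under "CSL code and runtime script" (reading the output and comparing it against a reference) can be sketched in plain Python. This is a minimal illustration only: it does not use the Cerebras runtime API, and the helper name ``validate_output``, the tolerances, and the sample data are all hypothetical stand-ins for whatever your runtime script actually reads back.

```python
import math


def validate_output(actual, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Return the indices where `actual` differs from `expected` beyond tolerance.

    Both arguments are flat sequences of floats. An empty list means the
    output matches the reference. (Hypothetical helper, not part of any SDK.)
    """
    if len(actual) != len(expected):
        raise ValueError(
            f"length mismatch: got {len(actual)}, expected {len(expected)}"
        )
    return [
        i
        for i, (a, e) in enumerate(zip(actual, expected))
        if not math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Stand-in data: in a real run script, `actual` would be the tensor read back
# from the simulator or device, and `expected` would be computed on the host.
expected = [2.0 * x + 1.0 for x in range(4)]
actual = [1.0, 3.0, 5.0000001, 7.0]

mismatches = validate_output(actual, expected)
print("PASS" if not mismatches else f"FAIL at indices {mismatches}")
```

Reporting the mismatching indices, rather than a single pass/fail flag, makes it easier to trace a wrong value back to the PE that produced it.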