# MARTe Development Tools

## Project goal

- A single portable executable named `mdt` (MARTe Development Tools) written in _Go_
- It should parse and index a custom configuration language (MARTe)
- It should provide an LSP server for this language
- It should provide a build tool for unifying multi-file projects into a single configuration output

## CLI Commands

The executable should support the following subcommands:

- `lsp`: Starts the Language Server Protocol server.
- `build`: Merges files with the same base namespace into a single output.
- `check`: Runs diagnostics and validations on the configuration files.
- `fmt`: Formats the configuration files.

## LSP Features

The LSP server should provide the following capabilities:

- **Diagnostics**: Report syntax errors and validation issues.
- **Hover Documentation**:
  - **Objects**: Display `CLASS::Name` and any associated docstrings.
  - **Signals**: Display `DataSource.Name TYPE (SIZE) [IN/OUT/INOUT]` along with docstrings.
  - **GAMs**: Show the list of States where the GAM is referenced.
  - **Referenced Signals**: Show the list of GAMs where the signal is referenced.
- **Go to Definition**: Jump to the definition of a reference, supporting navigation across any file in the current project.
- **Go to References**: Find usages of a node or field, supporting navigation across any file in the current project.
- **Code Completion**: Autocomplete fields, values, and references.
- **Code Snippets**: Provide snippets for common patterns.
- **Formatting**: Format the document using the same rules and engine as the `fmt` command.

## Build System & File Structure

- **File Extension**: `.marte`
- **Project Structure**: Files can be distributed across sub-folders.
- **Namespaces**: The `#package` macro defines the namespace for the file.
  - **Single File Context**: If no `#package` is defined in a file, the LSP, build tool, and validator must consider **only** that file (no project-wide merging or referencing).
  - **Semantic**: `#package PROJECT_NAME.SUB_URI` implies that:
    - `PROJECT_NAME` is a namespace identifier used to group files from the same project. It does **not** create a node in the configuration tree.
    - `SUB_URI` defines the path of nodes where the file's definitions are placed. All definitions within the file are treated as children/fields of the node defined by `SUB_URI`.
    - **URI Symbols**: The symbols `+` and `$` used for object nodes are **not** written in the URI of the `#package` macro (e.g., use `PROJECT.NODE` even if the node is defined as `+NODE`).
- **Build Process**:
  - The build tool merges all files sharing the same base namespace into a **single output configuration**.
  - **Namespace Consistency**: The build tool must verify that all input files belong to the same project namespace (the first segment of the `#package` URI). If multiple project namespaces are detected, the build must fail with an error.
  - **Target**: The build output is written to a single target file (e.g., provided via CLI or API).
- **Multi-File Definitions**: Nodes and objects can be defined across multiple files. The build tool, validator, and LSP must merge these definitions (including all fields and sub-nodes) from the entire project to create a unified view before processing or validating.
- **Global References**: References to nodes, signals, or objects can point to definitions located in any file within the project.
- **Merging Order**: For objects defined across multiple files, the **first file** to be considered is the one containing the `Class` field definition.
- **Field Order**: Within a single file, the relative order of defined fields must be maintained.
- The LSP indexes only files belonging to the same project/namespace scope.
- **Output**: The output format is the same as the input configuration but without the `#package` macro.

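
As a rough illustration of the `#package` and merging rules above, consider two files in the same project. The project name `DEMO`, the file names, the node names, and the classes `ReferenceContainer` and `GAMDataSource` are assumptions chosen for this sketch; only `RealTimeApplication` is named elsewhere in this document.

```
// File: app.marte (hypothetical)
#package DEMO
+App = {
  Class = RealTimeApplication
  +Data = {
    Class = ReferenceContainer
  }
}
```

```
// File: data/ddb.marte (hypothetical)
#package DEMO.App.Data
+DDB1 = {
  Class = GAMDataSource
}
```

Both files share the project namespace `DEMO`, so the namespace consistency check would pass. The second file places its definitions under the node path `App.Data`, written without the `+` prefix. Since `app.marte` contains the `Class` fields of `+App` and `+Data`, it would be considered first, and the build output would be expected to look roughly like this, with the `#package` lines dropped:

```
+App = {
  Class = RealTimeApplication
  +Data = {
    Class = ReferenceContainer
    +DDB1 = {
      Class = GAMDataSource
    }
  }
}
```
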
## MARTe Configuration Language

### Grammar

- `comment` : `//.*`
- `configuration`: `definition+`
- `definition`: `field = value | node = subnode`
- `field`: `[a-zA-Z][a-zA-Z0-9_\-]*`
- `node`: `[+$][a-zA-Z][a-zA-Z0-9_\-]*`
- `subnode`: `{ definition+ }`
- `value`: `string|int|float|bool|reference|array`
- `int`: `-?[0-9]+|0b[01]+|0x[0-9a-fA-F]+`
- `float`: `-?[0-9]+\.[0-9]+|-?[0-9]+\.?[0-9]*e\-?[0-9]+`
- `bool`: `true|false`
- `string`: `".*"`
- `reference` : `string|.*`
- `array`: `{ value }`

#### Extended grammar

- `package` : `#package URI`
- `URI`: `PROJECT | PROJECT.PRJ_SUB_URI`
- `PRJ_SUB_URI`: `NODE | NODE.PRJ_SUB_URI`
- `docstring` : `//#.*`
- `pragma`: `//!.*`

### Semantics

- **Nodes (`+` / `$`)**: The prefixes `+` and `$` indicate that the node represents an object.
  - **Constraint**: These nodes _must_ contain a field named `Class` within their subnode definition (across all files where the node is defined).
- **Signals**: Signals are considered nodes but **not** objects. They do not require a `Class` field.
- **Pragmas (`//!`)**: Used to suppress specific diagnostics. The developer can use these to explain why a rule is being ignored. Supported pragmas:
  - `//!unused: REASON` or `//!ignore(unused): REASON` - Suppress "Unused GAM" or "Unused Signal" warnings.
  - `//!implicit: REASON` or `//!ignore(implicit): REASON` - Suppress "Implicitly Defined Signal" warnings.
  - `//!allow(WARNING_TYPE): REASON` or `//!ignore(WARNING_TYPE): REASON` - Global suppression for a specific warning type across the whole project (supported: `unused`, `implicit`).
  - `//!cast(DEF_TYPE, CUR_TYPE): REASON` - Suppress "Type Inconsistency" errors if types match.
- **Structure**: A configuration is composed of one or more definitions.
- **Strictness**: Any content that is not a valid comment (or pragma/docstring) or a valid definition (Field, Node, or Object) is **not allowed** and must generate a parsing error.

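
The fragment below is a small sketch of how the grammar and semantics above might look in practice. The class `ExampleGAM`, the data source `DDB1`, and all field names other than `Class`, `DataSource`, and `Type` are hypothetical. Placing the docstring and pragmas directly above the definition they apply to is also an assumption of this sketch.

```
//# Hypothetical GAM used only to illustrate the syntax
//!unused: not scheduled in any state yet
+Example = {
  Class = ExampleGAM
  Count = 10
  Mask = 0xFF
  Flags = 0b101
  Gain = -1.5
  Enabled = true
  Label = "demo"
  Values = { 1 2 3 }
  OutputSignals = {
    //!implicit: defined here on purpose
    Result = {
      DataSource = DDB1
      Type = float32
    }
  }
}
```

Here `+Example` is an object node and therefore carries a `Class` field, while `Result` is a signal node and does not. The fields show decimal, hexadecimal, and binary `int` values, a `float`, a `bool`, a `string`, an `array`, and unquoted references (`DDB1`, `float32`).
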
### Core MARTe Classes

MARTe configurations typically involve several main categories of objects:

- **State Machine (`StateMachine`)**: Defines state machines and transition logic.
- **Real-Time Application (`RealTimeApplication`)**: Defines a real-time application, including its data sources, functions, states, and scheduler.
- **Data Source**: Multiple classes used to define input and/or output signal sources.
- **GAM (Generic Application Module)**: Multiple classes used to process signals.
  - **Constraint**: A GAM node must contain at least one `InputSignals` sub-node, one `OutputSignals` sub-node, or both.

### Signals and Data Flow

- **Signal Definition**:
  - **Explicit**: Signals defined within the `DataSource` definition.
  - **Implicit**: Signals defined only within a `GAM`, which are then automatically managed.
  - **Requirements**:
    - All signal definitions **must** include a `Type` field with a valid value.
    - **Size Information**: Signals can optionally include `NumberOfDimensions` and `NumberOfElements` fields. If not explicitly defined, these default to `1`.
    - **Property Matching**: Signal references in GAMs must match the properties (`Type`, `NumberOfElements`, `NumberOfDimensions`) of the defined signal in the `DataSource`.
    - **Extensibility**: Signal definitions can include additional fields as required by the specific application context.
- **Signal Reference Syntax**:
  - Signals are referenced or defined in `InputSignals` or `OutputSignals` sub-nodes using one of the following formats:
    1. **Direct Reference (Option 1)**:

       ```
       SIGNAL_NAME = {
         DataSource = DATASOURCE_NAME
         // Other fields if necessary
       }
       ```

       In this case, the GAM signal name is the same as the DataSource signal name.
    2. **Aliased Reference (Option 2)**:

       ```
       GAM_SIGNAL_NAME = {
         Alias = SIGNAL_NAME
         DataSource = DATASOURCE_NAME
         // ...
       }
       ```

       In this case, `Alias` points to the DataSource signal name.
  - **Implicit Definition Constraint**: If a signal is implicitly defined within a GAM, the `Type` field **must** be present in the reference block to define the signal's properties.
- **Directionality**: DataSources and their signals are directional:
  - `Input` (IN): Only providing data. Signals can only be used in `InputSignals`.
  - `Output` (OUT): Only receiving data. Signals can only be used in `OutputSignals`.
  - `Inout` (INOUT): Bidirectional data flow. Signals can be used in both `InputSignals` and `OutputSignals`.
  - **Validation**: The tool must validate that signal usage in GAMs respects the direction of the referenced DataSource.

### Object Indexing & References

The tool must build an index of the configuration to support LSP features and validations:

- **GAMs**: Referenced in `$APPLICATION.States.$STATE_NAME.Threads.$THREAD_NAME.Functions` (where `$APPLICATION` is a `RealTimeApplication` node).
- **Signals**: Referenced within the `InputSignals` and `OutputSignals` sub-nodes of a GAM.
- **DataSources**: Referenced within the `DataSource` field of a signal reference/definition.
- **General References**: Objects can also be referenced in other fields (e.g., as targets for messages).

### Validation Rules

- **Consistency**: The `lsp`, `check`, and `build` commands **must share the same validation engine** to ensure consistent results across all tools.
- **Global Validation Context**:
  - All validation steps must operate on the aggregated view of the project.
  - A node's validity is determined by the combination of all its fields and sub-nodes defined across all project files.
- **Class Validation**:
  - For each known `Class`, the validator checks:
    - **Mandatory Fields**: Verification that all required fields are present.
    - **Field Types**: Verification that values assigned to fields match the expected types (e.g., `int`, `string`, `bool`).
    - **Field Order**: Verification that specific fields appear in a prescribed order when required by the class definition.
    - **Conditional Fields**: Validation of fields whose presence or value depends on the values of other fields within the same node or context.
  - **Schema Definition**:
    - Class validation rules must be defined in a separate schema file.
    - **Project-Specific Classes**: Developers can define their own project-specific classes and corresponding validation rules, expanding the validation capabilities for their specific needs.
  - **Schema Loading**:
    - **Default Schema**: The tool should look for a default schema file `marte_schema.json` in standard system locations:
      - `/usr/share/mdt/marte_schema.json`
      - `$HOME/.local/share/mdt/marte_schema.json`
    - **Project Schema**: If a file named `.marte_schema.json` exists in the project root, it must be loaded.
    - **Merging**: The final schema is a merge of the built-in schema, the system default schema (if found), and the project-specific schema. Rules in later sources (Project > System > Built-in) append to or override earlier ones.
- **Duplicate Fields**:
  - **Constraint**: A field must not be defined more than once within the same object/node scope, even if those definitions are spread across different files.
  - **Multi-File Consideration**: Validation must account for nodes being defined across multiple files (merged) when checking for duplicates.

### Formatting Rules

The `fmt` command must format the code according to the following rules:

- **Indentation**: 2 spaces per indentation level.
- **Assignment**: 1 space before and after the `=` operator (e.g., `Field = Value`).
- **Comments**:
  - 1 space after `//`, `//#`, or `//!`.
  - Comments should "stick" to the next definition (no empty lines between the comment and the code it documents).
  - **Placement**:
    - Comments can be placed inline after a definition (e.g., `field = value // comment`).
    - Comments can be placed after a subnode opening bracket (e.g., `node = { // comment`) or after an object definition.
- **Arrays**: 1 space after the opening bracket `{` and 1 space before the closing bracket `}` (e.g., `{ 1 2 3 }`).
- **Strings**: Quoted strings must preserve their quotes during formatting.

### Diagnostic Messages

The LSP and `check` command should report the following:

- **Warnings**:
  - **Unused GAM**: A GAM is defined but not referenced in any thread or scheduler. (Suppress with `//!unused`)
  - **Unused Signal**: A signal is explicitly defined in a `DataSource` but never referenced in any `GAM`. (Suppress with `//!unused`)
  - **Implicitly Defined Signal**: A signal is defined only within a `GAM` and not in its parent `DataSource`. (Suppress with `//!implicit`)
- **Errors**:
  - **Type Inconsistency**: A signal is referenced with a type different from its definition. (Suppress with `//!cast`)
  - **Size Inconsistency**: A signal is referenced with a size (dimensions/elements) different from its definition.
  - **Invalid Signal Content**: The `Signals` container of a `DataSource` contains invalid elements (e.g., fields instead of nodes).
  - **Duplicate Field Definition**: A field is defined multiple times within the same node scope (including across multiple files).
  - **Validation Errors**:
    - Missing mandatory fields.
    - Field type mismatches.
    - Grammar errors (e.g., missing closing brackets).
  - **Invalid Function Reference**: Elements in the `Functions` array of a `State.Thread` must be valid references to defined GAM nodes.

## Logging

- **Requirement**: All logs must be managed through a centralized logger.
- **Output**: Logs should be written to `stderr` by default to avoid interfering with `stdout`, which might be used for CLI output (e.g., build artifacts or formatted text).