Flat File
A Flat File Source reads and parses structured files in multiple formats (CSV, positional, XML, JSON) from various acquisition channels (FTP, HTTP, S3, Azure Blob Storage).
Each row or logical unit becomes an independent message that is processed asynchronously, which improves scalability.
Acquisition channels
| Channel | Description |
|---|---|
| FTP / FTPS / SFTP | Securely download files from an FTP server. Auth via user/password or SSH key; paths, filename patterns, retention rules supported. |
| HTTP (TBD) | Download from HTTP/HTTPS endpoints, with optional authentication (API key, Basic, Bearer). |
| Amazon S3 (TBD) | Connect to a bucket and filter by prefix or filename. |
| Azure Blob Storage (TBD) | Access a container; auth via connection string or service principal. |
| Azure Blob Storage RFE | Extended variant for multi/coordination processing (in development). |
Reading protocols
Flowlyze supports two reading protocols:
Coordinated
Flowlyze uses a working directory with read/write permissions:
- Copy file from source directory to a temporary processing directory
- Read and parse contents
- Move file to a completion directory on finish
Advantages:
- Visibility into pending files
- Error inspection
- Full traceability
Directory cleanup and artifact handling are the user’s responsibility.
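The three coordinated steps can be sketched roughly as follows. This is only an illustration of the copy, parse, move sequence on a local filesystem; the function name `process_coordinated`, the directory arguments, and the `parse` callback are hypothetical and not part of Flowlyze.

```python
import shutil
from pathlib import Path


def process_coordinated(source_file: Path, processing_dir: Path,
                        completed_dir: Path, parse) -> list:
    """Copy -> parse -> move, mirroring the three coordinated steps."""
    processing_dir.mkdir(parents=True, exist_ok=True)
    completed_dir.mkdir(parents=True, exist_ok=True)

    # 1. Copy the file from the source directory to the processing directory.
    working_copy = processing_dir / source_file.name
    shutil.copy2(source_file, working_copy)

    # 2. Read and parse the contents. If parse() raises, the working copy
    #    stays in processing_dir, where it can be inspected.
    messages = parse(working_copy.read_text(encoding="utf-8"))

    # 3. Move the file to the completion directory on finish.
    shutil.move(str(working_copy), str(completed_dir / source_file.name))
    return messages
```

Because a failed parse leaves the working copy in the processing directory, pending and failed files remain visible, which matches the advantages listed above. Nothing in the sketch cleans up either directory.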
Simple (TBD)
Flowlyze downloads and processes the file directly, without local copies.
On errors, Flowlyze emits a flow error message with details but does not keep a copy of the original file.
Supported formats
| Format | Status | Description |
|---|---|---|
| CSV | Implemented | Delimited files with advanced parsing (header, quoting, culture, custom delimiters). |
| Positional | TBD | Fixed-width files with explicit column positions. |
| XML | TBD | XML with XPath selectors, node mapping. |
| JSON | TBD | Complex JSON with JSONPath entity selection. |
CSV configuration
| Parameter | Description |
|---|---|
| Line Delimiter | Usually \n or \r\n. |
| Column Delimiter | Default is the comma (","); a common alternative is the semicolon (";"). |
| Quote Character | Default is the double quote ("). |
| Culture | Number/date culture (e.g., it-IT, en-US). |
| Has Header | First row contains column names (true/false). |
| Grouping Column | Column used to group multiple rows into a single message. |
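
To make the effect of these parameters concrete, here is a minimal sketch using Python's standard `csv` module. The function `parse_csv` and its argument names are illustrative assumptions that loosely mirror the table, not Flowlyze's actual API. Culture-aware number and date parsing is not covered, and all values are returned as strings.

```python
import csv
import io


def parse_csv(text: str, column_delimiter: str = ",", quote_char: str = '"',
              has_header: bool = True) -> list[dict]:
    """Parse delimited text into one dict per row."""
    # newline="" lets the csv module handle both \n and \r\n line endings.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=column_delimiter, quotechar=quote_char)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return []
    if has_header:
        header, data = rows[0], rows[1:]
    else:
        # Without a header, fall back to positional names (an assumption).
        header, data = [f"column_{i}" for i in range(len(rows[0]))], rows
    return [dict(zip(header, row)) for row in data]
```

For example, `parse_csv('id;name\r\n1;"Rossi; Mario"\r\n', column_delimiter=";")` keeps the quoted semicolon inside the `name` value instead of splitting on it.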
Row Grouping
Flowlyze can group multiple source rows into one logical message using a grouping column.
This is useful for hierarchical relations or multiple versions of the same record in a file.
During parsing, Flowlyze evaluates the grouping column (e.g., product_id, record_id).
Rows with the same value are aggregated into one JSON object.
Each group yields a single message containing common fields and a nested array with grouped rows.
Example 1 – Variants grouped by main product
CSV with product variants (size, color, price) associated with a main product identified by product_id.
Source file
| product_id | variant_id | color | size | price |
|---|---|---|---|---|
| 1001 | 1 | red | M | 29.90 |
| 1001 | 2 | blue | L | 31.50 |
| 1002 | 3 | black | S | 28.00 |
Grouping column: product_id
Aggregated output
{
"product_id": 1001,
"variants": [
{ "variant_id": 1, "color": "red", "size": "M", "price": 29.90 },
{ "variant_id": 2, "color": "blue", "size": "L", "price": 31.50 }
]
},
{
"product_id": 1002,
"variants": [
{ "variant_id": 3, "color": "black", "size": "S", "price": 28.00 }
]
}
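The aggregation in this example can be approximated with a few lines of Python. This is a sketch of the grouping idea only: the function `group_rows` and its `nested_key` argument are hypothetical, and how Flowlyze actually names the nested array is not specified here.

```python
def group_rows(rows: list[dict], grouping_column: str,
               nested_key: str = "variants") -> list[dict]:
    """Aggregate rows sharing the same grouping value into one message."""
    groups: dict = {}  # dicts preserve insertion order, so file order is kept
    for row in rows:
        key = row[grouping_column]
        message = groups.setdefault(key, {grouping_column: key, nested_key: []})
        # Everything except the grouping column goes into the nested array.
        message[nested_key].append(
            {k: v for k, v in row.items() if k != grouping_column})
    return list(groups.values())


rows = [
    {"product_id": 1001, "variant_id": 1, "color": "red", "size": "M", "price": 29.90},
    {"product_id": 1001, "variant_id": 2, "color": "blue", "size": "L", "price": 31.50},
    {"product_id": 1002, "variant_id": 3, "color": "black", "size": "S", "price": 28.00},
]
messages = group_rows(rows, "product_id")  # two messages: 1001 and 1002
```

The sketch groups rows even when equal keys are not adjacent in the file; whether Flowlyze requires the file to be sorted by the grouping column is not stated in this section.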
Example 2 – Grouping multiple saves (journaling tables)
A journaling/audit table records multiple versions of the same record. The flow can consolidate them into one logical message.
Source file
| record_id | update_time | field | old_value | new_value |
|---|---|---|---|---|
| 501 | 2025-10-01 10:30:00 | status | draft | pending |
| 501 | 2025-10-02 11:45:00 | status | pending | approved |
| 501 | 2025-10-03 09:20:00 | note | null | "OK" |
Grouping column: record_id
Aggregated output
{
"record_id": 501,
"data": [
{
"update_time": "2025-10-01T10:30:00Z",
"field": "status",
"old_value": "draft",
"new_value": "pending"
},
{
"update_time": "2025-10-02T11:45:00Z",
"field": "status",
"old_value": "pending",
"new_value": "approved"
},
{
"update_time": "2025-10-03T09:20:00Z",
"field": "note",
"old_value": null,
"new_value": "OK"
}
]
}
In this example, Flowlyze emits one message per record_id, containing the complete change history in chronological order. This consolidates versions into a coherent representation for downstream systems (e.g., data lake, CRM, or auditing service).
JSON Configuration
For the JSON format, Flowlyze supports reading structured JSON files, both simple and complex, with the ability to select a specific portion of the document using JSONPath.
The JSON configuration is intentionally minimal: the parsing behavior mainly depends on the structure of the selected node (object, array, or single value).
Configuration Parameters
| Parameter | Description |
|---|---|
| Json Path | JSONPath expression that identifies the node in the JSON document from which to read data. If not specified, left empty, or set to "$", the document root is used. |
Parsing Behavior
Once the target node is determined (root or node selected via JSONPath), Flowlyze generates one or more messages based on the node type:
| Node Type | Result |
|---|---|
| JSON Object | A single message is generated containing the entire object. |
| JSON Array | One message per array element is generated. |
| Single Value (string, number, boolean, etc.) | A single message is generated containing the value. |
This behavior is identical whether using the document root or a JSONPath.
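
The node-type rules in the table reduce to a single branch, sketched below on values produced by Python's `json` module. The function name `messages_from_node` is a hypothetical helper used for illustration, not a Flowlyze function.

```python
import json


def messages_from_node(node) -> list:
    """Return the messages produced for a parsed JSON node."""
    if isinstance(node, list):
        return list(node)  # array: one message per element
    return [node]          # object or single value: one message


array_messages = messages_from_node(json.loads('[{"id": 1}, {"id": 2}]'))
object_messages = messages_from_node(json.loads('{"id": 1, "name": "Example"}'))
```

Here `array_messages` holds two messages and `object_messages` holds one, in line with the table.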
Usage Without Json Path
If Json Path is not configured or is set to "$":
- The entire JSON file is read.
- The document root is used as the input node.
- Message generation depends on the root type:
  - Object → 1 message
  - Array → N messages (one per element)
  - Single value → 1 message
Usage With Json Path
When Json Path is specified:
- The file is parsed as JSON.
- The JSONPath expression is applied to locate a specific node (for example, a nested array or object).
- If the path does not match any node, a configuration error is raised.
- The selected node becomes the input for message generation, following the same rules described above.
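The selection step can be illustrated with a deliberately small resolver. It handles only the root (`$`) and simple dotted paths such as `$.data.items`, which is a tiny subset of JSONPath and not the engine Flowlyze uses. The names `select_node` and `JsonPathConfigError` are assumptions made for this sketch.

```python
import json


class JsonPathConfigError(Exception):
    """Hypothetical error type for a path that matches no node."""


def select_node(document, json_path: str = "$"):
    """Resolve a dotted path such as $.data.items against parsed JSON."""
    if not json_path or json_path == "$":
        return document  # empty or "$" means the document root
    if not json_path.startswith("$."):
        raise JsonPathConfigError(f"unsupported path: {json_path}")
    node = document
    for key in json_path[2:].split("."):
        # Each segment must be a key of the current object.
        if not isinstance(node, dict) or key not in node:
            raise JsonPathConfigError(f"no node matches {json_path}")
        node = node[key]
    return node


doc = json.loads('{"data": {"items": [{"id": 1}, {"id": 2}]}}')
items = select_node(doc, "$.data.items")  # the nested array
```

Calling `select_node(doc, "$.missing.path")` raises `JsonPathConfigError`, mirroring the configuration error described in the list above.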
Configuration Examples
JSON Object as Root
{ "id": 1, "name": "Example" }
Configuration:
- Json Path: (not set)
Result:
- A single message is generated containing the complete JSON object.
JSON Array as Input
[
{ "id": 1 },
{ "id": 2 }
]
Configuration:
- Json Path: (not set)
Result:
- Two messages are generated, one for each array element.
Selection via Json Path
{
"data": {
"items": [
{ "id": 1 },
{ "id": 2 }
]
}
}
Configuration:
- Json Path: $.data.items
Result:
- Two messages are generated, one for each element of the items array.
Invalid Json Path
Configuration:
- Json Path: $.missing.path
Result:
- Configuration error: the JSONPath does not match any node in the document.