
Add streaming CSV processing support to Csv Data module #8511

@Nuvindu

Description

Problem

Currently, when processing a byte stream from a file source, the entire file is loaded into memory before it is parsed as CSV. For large files this buffering dominates memory usage and limits scalability.

stream<byte[], io:Error?> csvByteStream = check io:fileReadBlocksAsStream("example.csv");
// The whole byte stream is materialized in memory before parsing begins.
record {int id; string name;}[] csv1 = check csv:parseStream(csvByteStream);

Proposed Solution

The proposed solution enables record-by-record parsing, allowing CSV data to be streamed efficiently without loading the full file into memory.

stream<byte[], io:Error?> csvByteStream = check io:fileReadBlocksAsStream("example.csv");
stream<record {int id; string name;}, io:Error?> csv1 = check csv:parseAsStream(csvByteStream);

foreach var rec in csv1 {
    // Process individual records
}

This enhancement makes CSV parsing significantly more memory-efficient and scalable when handling large files, since memory usage stays proportional to a single record rather than the whole file.
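As a sketch of how a caller might consume the proposed API: the snippet below assumes the hypothetical `csv:parseAsStream` function described above (not yet part of the module) and uses a query action so that an `io:Error` surfaced mid-stream is propagated via `check` rather than silently ending iteration.

```ballerina
import ballerina/data.csv;
import ballerina/io;

type Person record {
    int id;
    string name;
};

public function main() returns error? {
    stream<byte[], io:Error?> csvByteStream = check io:fileReadBlocksAsStream("example.csv");
    // Hypothetical proposed API: records are parsed lazily as blocks are pulled.
    stream<Person, io:Error?> people = check csv:parseAsStream(csvByteStream);
    // A query action handles the stream's error completion type explicitly.
    check from Person p in people
        do {
            io:println(string `${p.id}: ${p.name}`);
        };
}
```

Because the record stream's completion type includes `io:Error?`, consumers would need `check` (or an explicit error branch) at the point of iteration; a design question for the proposal is whether parse errors for individual malformed rows terminate the stream or are skippable.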

Alternatives

No response

Version

2201.12.0
