A new JSON format for streaming long JSON data: NDJSON

Posted: (EET/GMT+2)


When working with JSON data, one problem has always been how to handle large datasets efficiently. Traditional JSON requires the entire document to be read and parsed before any data can be processed. That's fine for small files, but once you start dealing with large logs, telemetry, or export files, it becomes a bottleneck.

Recently, I came across an interesting alternative called NDJSON (Newline-Delimited JSON). Instead of one big JSON array, each record is written as a single JSON object, with the records separated by a newline character (LF, \n):

{"id": 1, "name": "Alice", "city": "London"}
{"id": 2, "name": "Bob", "city": "New York"}
{"id": 3, "name": "Charlie", "city": "Helsinki"}

The format also allows the CRLF sequence as the newline character combination: good for us Windows folks.

Each line is an independent JSON object. There are no enclosing [ or ] characters nor any commas at the end of each object. It's still valid JSON on each line; just in slightly smaller pieces.

This makes NDJSON great for:

  • Streaming data as it’s produced, instead of buffering it all first.
  • Appending new lines easily (try that with a regular JSON array!).
  • Processing large files line by line — even with simple tools like type, findstr, or PowerShell’s Get-Content.
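The append case in particular is trivial: since there are no surrounding brackets or trailing commas, adding a record is a single file append. A minimal sketch (assuming System.Text.Json; the file name and record values are just examples):

```csharp
using System;
using System.IO;
using System.Text.Json;

// Append one more record: no need to rewrite or re-parse the existing file,
// because NDJSON has no enclosing array brackets or commas to maintain.
var record = new { id = 4, name = "Dana", city = "Oslo" };
File.AppendAllText("data.ndjson",
    JsonSerializer.Serialize(record) + Environment.NewLine);
```

Compare that with a regular JSON array, where you would have to rewrite or at least patch the closing bracket on every append.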

It's also a natural fit for log files, ETL jobs, or any system that processes events continuously.

Here's a quick example for writing and reading NDJSON in .NET:

// writing NDJSON (requires using System.IO and System.Text.Json)
using (var writer = new StreamWriter("data.ndjson"))
{
    foreach (var record in records)
    {
        string json = JsonSerializer.Serialize(record);
        writer.WriteLine(json);
    }
}

// reading NDJSON
foreach (var line in File.ReadLines("data.ndjson"))
{
    // Deserialize needs a target type; JsonElement works when the shape varies
    var obj = JsonSerializer.Deserialize<JsonElement>(line);
    Process(obj);
}

Because each line is a complete JSON object, you can safely process data incrementally — even as it’s being generated.
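For example, you can filter a large NDJSON file without ever materializing the whole dataset. Here's a small self-contained sketch using the sample records from the start of the post:

```csharp
using System;
using System.IO;
using System.Text.Json;

// Set up the sample file from the start of the post.
File.WriteAllLines("data.ndjson", new[]
{
    "{\"id\": 1, \"name\": \"Alice\", \"city\": \"London\"}",
    "{\"id\": 2, \"name\": \"Bob\", \"city\": \"New York\"}",
    "{\"id\": 3, \"name\": \"Charlie\", \"city\": \"Helsinki\"}"
});

// File.ReadLines streams lazily, one line at a time, so memory use stays
// flat no matter how large the file grows.
foreach (var line in File.ReadLines("data.ndjson"))
{
    if (string.IsNullOrWhiteSpace(line)) continue; // tolerate a trailing newline

    using var doc = JsonDocument.Parse(line);
    if (doc.RootElement.GetProperty("city").GetString() == "London")
        Console.WriteLine(line); // prints only Alice's record
}
```

With a regular JSON array, the same filter would first need to parse the entire document into memory.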

The format isn't new, but it's not widely known either. Many open-source data tools (like Elasticsearch and log shippers) already use it internally. For .NET developers, it could be a very practical way to handle continuous or large JSON data sources.

Since this format looks promising, I still need to test whether an ASP.NET web application could stream it directly to a browser or client application. That might be an interesting future experiment.
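I haven't tried it yet, but a rough sketch of what that experiment could look like with an ASP.NET Core minimal API follows. Everything here is illustrative: the /events path and the anonymous records are made up, and application/x-ndjson is a commonly used but unofficial media type.

```csharp
// Program.cs: hypothetical minimal ASP.NET Core endpoint streaming NDJSON.
using System;
using System.Text.Json;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.MapGet("/events", async (HttpResponse response) =>
{
    response.ContentType = "application/x-ndjson";
    for (int i = 1; i <= 3; i++)
    {
        var json = JsonSerializer.Serialize(new { id = i, time = DateTime.UtcNow });
        await response.WriteAsync(json + "\n");
        // Flush after each line so the client receives records as they are
        // produced, rather than one buffered response at the end.
        await response.Body.FlushAsync();
    }
});

app.Run();
```

On the client side, each received line would again be an independent, parseable JSON object, so the same line-by-line processing applies.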