The Avro Output step serializes data from the PDI data stream into Avro binary or JSON format, then writes it to a file. Apache Avro is a data serialization system that relies on a schema to decode binary data and extract its contents.
This output step creates the following files:
- A file containing output data in the Avro format
- An Avro schema file defined by the fields in this step
Fields can be defined manually or extracted from incoming steps.
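To make the idea of a schema "defined by the fields" more concrete, the following is a minimal sketch of how a list of fields could be turned into an Avro record schema. It is not the step's actual implementation. The field names, the `build_avro_schema` helper, and the `PDI_TO_AVRO` type mapping are all assumptions made for illustration, and the schema file the step writes may differ in detail.

```python
import json

# Hypothetical mapping from PDI field types to Avro primitive types.
# This is an assumption for illustration; in the step you choose the Avro type per field.
PDI_TO_AVRO = {
    "String": "string",
    "Integer": "long",
    "Number": "double",
    "Boolean": "boolean",
}


def build_avro_schema(record_name, fields, namespace=None):
    """Build an Avro record schema (as a dict) from (name, pdi_type, nullable) tuples."""
    schema = {"type": "record", "name": record_name, "fields": []}
    if namespace:
        schema["namespace"] = namespace
    for name, pdi_type, nullable in fields:
        avro_type = PDI_TO_AVRO[pdi_type]
        if nullable:
            # A nullable field is a union with "null" listed first and a null default.
            schema["fields"].append(
                {"name": name, "type": ["null", avro_type], "default": None}
            )
        else:
            schema["fields"].append({"name": name, "type": avro_type})
    return schema


# Hypothetical fields, as they might arrive from an incoming step.
example_fields = [
    ("customer_id", "Integer", False),
    ("name", "String", True),
    ("balance", "Number", True),
]

schema = build_avro_schema("Customer", example_fields)
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

The resulting JSON is the kind of content an Avro schema file holds: a record with one entry per field. A reader of the data file needs this schema (or a compatible one) to decode the binary records.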
Select an engine
You can run the Avro Output step on the Pentaho engine or on the Spark engine. The transformation runs differently depending on the engine you select, so the step is set up differently for each. Select one of the following options to view how to set up the Avro Output step for your engine.
- Using the Avro Output step on the Pentaho engine: Learn how to set up this step when using the Pentaho engine.
- Using the Avro Output step on the Spark engine: Learn how to set up this step when using the Spark engine.
For instructions on selecting an engine for your transformation, see Run configurations.