
ETL PIPELINE
Clean Your Unstructured Data

The Agnes ETL (Extract, Transform, Load) Pipeline is an automated, scalable way to take data of any kind and convert it into a format that a search engine can operate on.

Agnes’ ETL Pipeline transforms disparate unstructured data formats into clean semi-structured JSON and XML.

Documents: XML, HTML, Docx, Xlsx, Pptx, PDF, RTF, Text, RSS/ATOM, CSV

Images: Tiff, Png, Jpeg, Gif

Video: Mp4, M4a, M4v, F4v, F4a, M4b, M4r, F4b, Mov

Audio: Alaw, Basic, Flac, G729, L16, Mp3, Mpeg, Mulaw, Ogg, Wav, Webm

Email: Mbox, PST, Eml

These formats are supported today, and the pipeline can be extended to handle additional formats.
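To make the idea concrete, here is a minimal sketch of how a file might be routed by format group and wrapped in a semi-structured JSON record. It is an illustration only, not Agnes' actual implementation: the `FORMAT_CATEGORIES` mapping, the `classify` and `to_record` functions, and the record fields are all hypothetical, and the format names from the list above are treated as file extensions (with `txt`, `jpg`, and `tif` added as assumed aliases).

```python
import json
from pathlib import Path

# Format groups taken from the list above, lowercased and treated as file
# extensions. This mapping is an assumption for illustration; "txt", "jpg",
# and "tif" are assumed aliases that do not appear in the original list.
FORMAT_CATEGORIES = {
    "document": {"xml", "html", "docx", "xlsx", "pptx", "pdf", "rtf",
                 "text", "txt", "rss", "atom", "csv"},
    "image": {"tiff", "tif", "png", "jpeg", "jpg", "gif"},
    "video": {"mp4", "m4a", "m4v", "f4v", "f4a", "m4b", "m4r", "f4b", "mov"},
    "audio": {"alaw", "basic", "flac", "g729", "l16", "mp3", "mpeg",
              "mulaw", "ogg", "wav", "webm"},
    "email": {"mbox", "pst", "eml"},
}


def classify(filename: str) -> str:
    """Return the format group for a filename, or "unknown"."""
    ext = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in FORMAT_CATEGORIES.items():
        if ext in extensions:
            return category
    return "unknown"


def to_record(filename: str, extracted_text: str) -> str:
    """Wrap already-extracted text in a JSON record (hypothetical schema)."""
    record = {
        "source": filename,
        "format": Path(filename).suffix.lower().lstrip("."),
        "category": classify(filename),
        "content": extracted_text.strip(),
    }
    return json.dumps(record, ensure_ascii=False)
```

Calling `to_record("report.pdf", text)` would produce a JSON string whose `category` field is `"document"`. Supporting a new format in this sketch means adding its extension to the relevant set; the actual text extraction for each format is out of scope here.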
