Uploading a gz file without manual effort

For one use case, we’re getting a CSV file from a customer as a .gz file (gzip-compressed). Are there any best practices in OD for uploading this file without having to decompress it manually first?

I saw in the Data Tables section that only CSV and ZIP files are supported.


ZIP was, and still is, the most common archive format, so we chose it when we first added support for uploading compressed archives. It is currently the only supported compression format for uploads.

However, Spark can read gzip-compressed files natively. Have you tried using a FileSystem Connection and uploading the .gz file directly to a mounted volume on the server side (or having the customer upload it there)? I cannot guarantee that it works out of the box, but the chances are quite high.

If you want (or have) to upload it via our web interface or the API and want to reduce manual effort, you have two options:

  1. Request gz support for uploads as a feature
  2. Automate the conversion process on your machine using a script

Both gz and ZIP can be processed as streams. If you go with option 2 and use suitable tooling, you should be able to convert the file on the fly without unpacking it completely as an intermediate step, which reduces the pressure on your local file system.
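
As a rough illustration of option 2, here is a minimal Python sketch that re-packs a .gz file as a single-entry ZIP in chunks, using only the standard library. The function name `gz_to_zip`, the file names, and the chunk size are hypothetical and not part of OD. I have not verified that the upload accepts ZIP64 archives (enabled here so that entries over 2 GiB can be written), so treat that as an assumption to test.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def gz_to_zip(gz_path, zip_path=None, chunk_size=1024 * 1024):
    """Re-pack a .gz file as a single-entry .zip without an unpacked temp file."""
    gz_path = Path(gz_path)
    # "data.csv.gz" -> entry name "data.csv", archive "data.csv.zip"
    entry_name = gz_path.stem if gz_path.suffix == ".gz" else gz_path.name
    zip_path = Path(zip_path) if zip_path else gz_path.with_name(entry_name + ".zip")

    with gzip.open(gz_path, "rb") as src, \
            zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive, \
            archive.open(entry_name, "w", force_zip64=True) as dst:
        # force_zip64 is needed because the uncompressed size is unknown up front
        # and may exceed 2 GiB (assumption: the upload target can read ZIP64).
        # Copy chunk by chunk, so only chunk_size bytes are held in memory at a time.
        shutil.copyfileobj(src, dst, chunk_size)
    return zip_path
```

Calling `gz_to_zip("export.csv.gz")` would write `export.csv.zip` next to the input; the resulting file can then be uploaded like any other ZIP.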