A trusted solution
CSV has been a reliable data transfer format for decades. Many vendors support it, it can be read with simple tools in a memory-efficient way, and humans can sanity-check it when things go wrong. It's not a bad choice for sending data to Faraday.
You can send us CSVs in the following ways:
Upload to Amazon S3.
Upload to Google Cloud Storage (GCS).
Upload via SFTP. We use AWS Transfer.
Key to success: same columns every time
The most important factor in a successful CSV Source with Faraday is that every CSV has the same columns, every time. It's best to set up automation on your end so that the file is exported the same way every time. If you do, the data will be ingested automatically on our end with no human intervention.
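As a pre-upload safeguard, your export automation could compare each file's header row against a fixed list before sending it. This is only a minimal sketch: the column names and the function name are hypothetical, and it assumes column order should also stay fixed, which is stricter than the requirement stated above.

```python
import csv
import io

# Hypothetical column list for one dataset; substitute your own export's headers.
EXPECTED_COLUMNS = ["customer_id", "email", "signup_date"]


def header_matches(csv_text, expected=EXPECTED_COLUMNS):
    """Return True if the first row of csv_text equals the expected columns, in order."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader, None)  # None when the file is empty
    return header == list(expected)
```

Running a check like this before every upload lets your side catch a changed export before the file reaches the bucket.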
How to upload different datasets via CSV: different folders
Most clients want to send us several different datasets. With CSV, you can do that by using a separate folder for each dataset inside your cloud bucket.
Inside of every folder, every file must have the same columns.
For example, suppose Acme wants to send customers, transactions, coupon codes, and product lists to Faraday.
Either Faraday or Acme sets up an S3 bucket and gives the other party access.
Inside that bucket, we create multiple folders, one per dataset.
Acme sets up an automated export and upload of CSVs to these folders.
Acme or Faraday sets up one Source Table for each of these folders.
Once all of this is in place, Acme's data will be synced continuously to Faraday's systems.
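The one-folder-per-dataset layout in the Acme example could be sketched as a small helper that builds the object key for each export. The folder names and the dated file naming scheme are assumptions made for illustration, not names Faraday requires.

```python
from datetime import date

# Hypothetical folder names, one per dataset in the Acme example.
DATASET_FOLDERS = ["customers", "transactions", "coupon_codes", "product_lists"]


def object_key(folder, export_date):
    """Build a bucket key that places a dated CSV export inside its dataset's folder."""
    if folder not in DATASET_FOLDERS:
        raise ValueError(f"unknown dataset folder: {folder}")
    return f"{folder}/{folder}_{export_date.isoformat()}.csv"
```

Routing every export through a single helper like this keeps each dataset in its own folder, so files with different columns never get mixed.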
What data to include in the CSV
Please make sure to read What data Faraday expects. It tells you what PII, dates, etc. we need and will help make your data ingestion as smooth as possible.
How to format CSVs
The first row should be headers
Dates in YYYY-MM-DD (aka ISO 8601)
Comma-delimited (not tab)
Cells with commas should be surrounded by double quotes
Double quotes inside of cells should be doubled up ("")
One email/address/phone per cell
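Most CSV libraries can apply these rules for you. As a minimal sketch, Python's standard `csv` module writes a header row, comma delimiters, and doubled-up quotes as shown here. The function name and the sample columns are hypothetical.

```python
import csv
import io
from datetime import date


def rows_to_csv(header, rows):
    """Serialize rows as comma-delimited CSV with a header row and ISO 8601 dates."""
    buf = io.StringIO(newline="")
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only cells that need it (commas, quotes)
        lineterminator="\n",
    )
    writer.writerow(header)
    for row in rows:
        # date objects become YYYY-MM-DD; other values are written unchanged
        writer.writerow([v.isoformat() if isinstance(v, date) else v for v in row])
    return buf.getvalue()
```

For instance, a cell containing `Acme, Inc.` is written as `"Acme, Inc."`, and a cell containing `said "hi"` is written as `"said ""hi"""`.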
Things to watch out for
If a folder in your S3 bucket contains files with different columns, our system won't be able to ingest them automatically, and this may cause multiple days of delay for every ingestion.
If you have a CSV with multiple emails/addresses/etc. in the same cell, we won't be able to ingest it.
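A rough pre-flight check for the second problem might look like the sketch below. It is a heuristic rather than a real email validator: it counts email-like tokens in one column and flags any row with more than one. The column name `email` and the function name are assumptions.

```python
import csv
import io
import re

# Heuristic: an email-like token is any run containing "@" with no whitespace or , ; | in it.
EMAIL_LIKE = re.compile(r"[^\s,;|]+@[^\s,;|]+")


def rows_with_multiple_emails(csv_text, column="email"):
    """Return record numbers (the header is record 1) whose cell holds more than one email."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    flagged = []
    for record_no, row in enumerate(reader, start=2):
        cell = row.get(column) or ""
        if len(EMAIL_LIKE.findall(cell)) > 1:
            flagged.append(record_no)
    return flagged
```

Flagged rows can then be split into one row per email, or trimmed to a single primary email, before the file is uploaded.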