Reading/writing to storage


Last updated 3 years ago


Importing data

Terality uses the same methods as pandas to load data, such as read_csv and read_parquet. For example:

import terality as te

# Load all parquet files at this S3 location
df = te.read_parquet("s3://my-datasets/path/to/objects/")

# Load a CSV file from disk
df = te.read_csv("/path/to/my/data.csv")

You can import data just as you would with pandas, for example with read_csv or read_parquet on a local file or on cloud storage (such as AWS S3). You can find the currently supported functions in the Data formats section.

You can also read multiple files at once by specifying a folder path to the read method. This is supported for the following functions:

  • read_csv

  • read_parquet

  • read_excel

  • read_json
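As a sketch, reading every file under a folder in one call looks like this (the S3 path below is a placeholder):

```python
import terality as te

# Read all CSV files under this S3 prefix into a single DataFrame
# (bucket and prefix names are illustrative)
df = te.read_csv("s3://my-datasets/daily-exports/")
```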

Do not hesitate to contact us if you want us to implement any other read function.

In addition, Terality provides a way to convert pandas objects into Terality structures, using the from_pandas method.
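A minimal sketch, assuming the Terality client is set up and a pandas DataFrame is already in memory:

```python
import pandas as pd
import terality as te

# An ordinary in-memory pandas DataFrame (sample data)
pd_df = pd.DataFrame({"city": ["Paris", "Lyon"], "population": [2_161_000, 513_000]})

# Convert it into a Terality DataFrame
te_df = te.from_pandas(pd_df)
```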

Exporting data

If you're done working on your DataFrame for the moment, or if it's still too big to fit in memory on your computer, you may want to save it back to your local drive or to cloud storage. To do this, simply use the same API as pandas:

# For instance, for AWS S3 and parquet:
df.to_parquet("s3://my_bucket/my_key/my_data.parquet")

Best practice: we recommend adopting a modern and scalable data workflow by using:

  • cloud storage rather than local storage (to avoid transfers being limited by your bandwidth)

  • a modern, fast, scalable and powerful data format such as parquet, rather than CSV

You can also export to multiple files, using to_csv_folder or to_parquet_folder from the Terality API.
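As a sketch (the output location is a placeholder; see the Write to multiple files reference for the exact signature):

```python
# Export the DataFrame as several parquet files under one S3 prefix,
# instead of a single large file
df.to_parquet_folder("s3://my_bucket/exports/my_data/")
```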
