Writing to multiple files

Exporting data (multiple files)

When dealing with big data, it is often inconvenient to store all of it in a single huge file of several GBs (or worse, tens or hundreds of GBs). To help with this, Terality provides functions that let you save your DataFrame across several files:

df.to_parquet_folder(
    path="s3://my_bucket/my_key/part_*.parquet",
    num_rows_per_file=1_000_000,
)

Here, setting num_rows_per_file specifies how many rows each resulting file contains. Alternatively, you can specify the number of files to produce or the in-memory size of each file, as sketched below.
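
For example, if you prefer to control the number of output files directly, the call looks like the following sketch. The parameter name num_files is an assumption based on the description above, not a confirmed signature; check the API reference for the exact name and accepted values.

df.to_parquet_folder(
    path="s3://my_bucket/my_key/part_*.parquet",
    num_files=20,  # assumed parameter name: split the DataFrame into 20 files
)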

Check out the API reference for all parameters and other output formats (such as to_csv_folder).
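
For CSV output, the call is analogous. This is a minimal sketch assuming to_csv_folder accepts the same path pattern and num_rows_per_file parameter as to_parquet_folder:

df.to_csv_folder(
    path="s3://my_bucket/my_key/part_*.csv",
    num_rows_per_file=1_000_000,  # one CSV file per million rows
)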