Deploy Terality in your own AWS account


Last updated 3 years ago

The Terality solution can be deployed in your own AWS account. With such a self-hosted deployment, your data never leaves your AWS account. This may help you comply with data protection requirements.

Self-hosted deployments are available on the Ultimate plan. If you're interested in this feature, please sign up at https://www.terality.com/sign-up.

This page describes the architecture of a self-hosted Terality solution.

Components

The Terality solution includes four components:

  • The client library, as implemented by the terality Python package. This library sends computation requests to Terality servers.

  • The scheduler: the component responsible for receiving computation requests and preparing an execution plan for each computation. The scheduler only accesses metadata about the processed data, never the data itself.

  • Workers: execution units that read data and run the actual computations.

  • A storage layer that stores the data. This storage layer is implemented on AWS S3, though Terality can also import data from many other storage solutions, such as Azure Data Lake.

On its SaaS platform, Terality deploys and maintains all components.

In a self-hosted deployment, the workers and the storage layer are moved to your AWS account. The scheduler is still deployed by Terality, and the client library is unchanged.

Data ownership in a self-hosted deployment

As described above, in a self-hosted deployment, Terality workers run in your AWS account, and the data is stored in an S3 bucket in your AWS account.

The Terality scheduler still runs in a Terality AWS account, but that account has no permissions to read data from your data S3 bucket.

Workers that read from and write to the data S3 bucket run in your AWS account. As such, your data never leaves your AWS account.

The Terality scheduler only accesses metadata, such as the number of rows in a dataset, or the number and names of its columns.
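To make the metadata/data split concrete, here is a minimal sketch using only the Python standard library. The field names are illustrative, not Terality's actual wire format; the point is that a scheduler can plan work from shape information alone, without ever reading row values:

```python
import csv
import io

# A small in-memory dataset standing in for a file in your data S3 bucket.
raw = io.StringIO("city,population\nParis,2100000\nLyon,520000\n")

rows = list(csv.reader(raw))
header, body = rows[0], rows[1:]

# Metadata of the kind described above: shape and column names only.
# The actual cell values never leave the worker side.
metadata = {
    "num_rows": len(body),
    "num_columns": len(header),
    "column_names": header,
}

print(metadata)
```

In a self-hosted deployment, only a summary like `metadata` would cross the account boundary to the scheduler; the rows themselves stay with the workers in your account.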

Permissions you grant to Terality in a self-hosted deployment

The Terality AWS account will be granted the following permissions:

  • Permissions to launch Terality workers in your account, reproducing the dynamic autoscaling of the hosted version. This means being able to create and invoke AWS Lambda functions, manage AWS EC2 virtual machines and autoscaling groups, and create and manage AWS ECS clusters in your account.

  • Permissions to read observability data from AWS CloudWatch, in order to monitor the health of your deployment.
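As an illustration, an IAM policy granting that kind of access could look like the sketch below. This is not the actual policy shipped in the CloudFormation template; the specific action list and the broad `"Resource": "*"` scoping are assumptions chosen to match the two categories above. Note that no data-access actions (such as `s3:GetObject`) appear:

```python
import json

# Hypothetical IAM policy document illustrating the permission
# categories listed above: worker autoscaling and observability.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "LaunchWorkers",
            "Effect": "Allow",
            "Action": [
                "lambda:CreateFunction",
                "lambda:InvokeFunction",
                "ec2:RunInstances",
                "autoscaling:CreateAutoScalingGroup",
                "ecs:CreateCluster",
                "ecs:RunTask",
            ],
            "Resource": "*",
        },
        {
            "Sid": "ReadObservabilityData",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricData",
                "logs:FilterLogEvents",
            ],
            "Resource": "*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

A real policy would scope `Resource` down to the specific functions, clusters, and log groups created by the template rather than using `"*"`.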

How is the self-hosted version deployed?

Deploying a self-hosted version of Terality only takes a few minutes. We provide you with a CloudFormation template that creates all the required AWS resources in your account.

Once the template is deployed, the Terality scheduler will create and autoscale workers in your AWS account as needed. No further management is needed on your part.

Should you wish to revoke access to your data, simply delete the CloudFormation stack from your account: this removes the resources the template created, including the roles that granted Terality access.
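With the AWS CLI or boto3, deploying such a template boils down to creating (and, to revoke access, deleting) a CloudFormation stack. The sketch below only builds the call parameters; the stack name and template URL are hypothetical placeholders, not real Terality values:

```python
# Sketch of the CloudFormation calls involved in a self-hosted deployment.
# "terality-self-hosted" and the template URL are placeholders.
create_stack_kwargs = {
    "StackName": "terality-self-hosted",
    "TemplateURL": "https://example.com/terality-self-hosted.template.yaml",
    # A template that creates named IAM roles requires an explicit
    # acknowledgement before CloudFormation will run it:
    "Capabilities": ["CAPABILITY_NAMED_IAM"],
}

# With boto3, deployment and teardown would look like:
#   cf = boto3.client("cloudformation")
#   cf.create_stack(**create_stack_kwargs)
#   ...
#   cf.delete_stack(StackName=create_stack_kwargs["StackName"])

print(create_stack_kwargs["StackName"])
```

Deleting the stack is what revokes access in practice: CloudFormation tears down every resource the template created, including the cross-account IAM roles.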

(Figure: Terality self-hosted deployment architecture)