This page describes the architecture of a self-hosted Terality solution.
The Terality solution includes four components:
The client library, as implemented by the terality Python package. This library sends computation requests to Terality servers.
The scheduler: the component responsible for receiving computation requests, and preparing execution plans for each computation. This scheduler only accesses metadata about the processed data, never the data itself.
Workers: execution units that read data and run the actual computations.
The storage layer: where the data is stored. It is implemented with AWS S3, although Terality can import data from many other storage solutions, such as Azure Data Lake.
On its SaaS platform, Terality deploys and maintains all components.
In a self-hosted deployment, the workers and the storage layer are moved to your AWS account. The scheduler is still deployed by Terality, and the client library is unchanged.
Terality self-hosted deployment architecture
Data ownership in a self-hosted deployment
As described above, in a self-hosted deployment, Terality workers run in your AWS account, and the data is stored in an S3 bucket in your AWS account.
The Terality scheduler still runs in a Terality AWS account, but that account has no permission to read from your data S3 bucket.
The workers that read from and write to the data S3 bucket run in your AWS account. As such, your data never leaves your AWS account.
The Terality scheduler only accesses metadata, such as the number of rows in a dataset, or number and names of columns in a dataset.
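To make the distinction between data and metadata concrete, here is a minimal sketch in Python. It is only an illustration: the function name `describe_dataset` and the shape of the returned dictionary are hypothetical, and do not reflect the format Terality actually uses. It shows the kind of summary (row count, column count, column names) that can be shared without exposing any cell values.

```python
import csv
import io


def describe_dataset(csv_text):
    """Return metadata about a CSV dataset without returning any cell values.

    Hypothetical helper for illustration only; not part of the terality package.
    """
    reader = csv.reader(io.StringIO(csv_text))
    column_names = next(reader)          # the header row gives the column names
    row_count = sum(1 for _ in reader)   # count the data rows without keeping them
    return {
        "column_names": column_names,
        "column_count": len(column_names),
        "row_count": row_count,
    }


# The data itself (names, amounts) stays local; only the summary is produced.
sample = "name,amount\nalice,10\nbob,25\n"
metadata = describe_dataset(sample)
print(metadata)
```

In this sketch, the values "alice" or 25 never appear in the result: only the structure of the dataset does.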
Permissions you grant to Terality in a self-hosted deployment
The Terality AWS account will be granted the following permissions:
Permissions to launch Terality workers in your account, to reproduce the dynamic autoscaling of the hosted version. This means being able to create and invoke AWS Lambda functions, manage AWS EC2 virtual machines and autoscaling groups, and create and manage AWS ECS clusters in your account.
Permissions to read observability data from AWS CloudWatch, so that Terality can monitor the health of your deployment.
How is the self-hosted version deployed?
Deploying a self-hosted version of Terality only takes a few minutes. We provide you with a CloudFormation template that creates all the required AWS resources in your account.
Once the template is deployed, the Terality scheduler will create and autoscale workers in your AWS account as needed. No further management is needed on your part.
Should you wish to revoke access to your data, simply delete the CloudFormation stack created from the template in your account.
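The deployment and revocation steps can also be scripted. The sketch below is one possible way to do it with a boto3 CloudFormation client, not an official Terality procedure. The function names, the stack name, and the template URL are hypothetical placeholders (use the values provided to you by Terality), and the `CAPABILITY_NAMED_IAM` capability is an assumption based on the fact that the template grants permissions to the Terality AWS account.

```python
def deploy_terality_stack(cfn_client, stack_name, template_url):
    """Create a CloudFormation stack from a template and wait until it is ready.

    `cfn_client` is expected to be a boto3 CloudFormation client.
    `stack_name` and `template_url` are placeholders chosen by the caller.
    """
    cfn_client.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        # Assumption: the template creates IAM resources, which requires
        # an explicit acknowledgement from the account owner.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )
    # Block until CloudFormation reports that the stack was created.
    cfn_client.get_waiter("stack_create_complete").wait(StackName=stack_name)


def revoke_terality_access(cfn_client, stack_name):
    """Delete the stack, removing the resources and permissions it created."""
    cfn_client.delete_stack(StackName=stack_name)
    # Block until CloudFormation reports that the stack was deleted.
    cfn_client.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```

With boto3 installed and AWS credentials configured, a suitable client can be obtained with `boto3.client("cloudformation")` and passed as `cfn_client`. Passing the client as an argument keeps the functions easy to test without touching a real AWS account.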