When importing from databases, many customers create separate tables/views for Aampe that are limited to only the data necessary for Aampe to function.

Connect Snowflake to Aampe (via Google Cloud Storage)

1. Create Snowflake Integration

Integrations are named, first-class Snowflake objects that avoid the need for passing explicit cloud provider credentials such as secret keys or access tokens; instead, integration objects reference a Cloud Storage service account. Aampe will provide you with the bucket link mentioned below.
CREATE STORAGE INTEGRATION gcs_int
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = GCS
ENABLED = TRUE
STORAGE_ALLOWED_LOCATIONS = ('gcs-bucket-link');
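Storage integrations can only be created by ACCOUNTADMIN (or a role granted the CREATE INTEGRATION privilege). If a different role will create the external stage in step 4, that role also needs usage on the integration. A minimal sketch, assuming a hypothetical ETL_ROLE will own the stage:
-- Allow a non-admin role (hypothetical name) to reference the integration
-- when creating the external stage in step 4
GRANT USAGE ON INTEGRATION gcs_int TO ROLE ETL_ROLE;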

2. Obtain GCP Service Account

A GCP Service account is created by Snowflake as part of Step 1:
DESC STORAGE INTEGRATION gcs_int;
In the output of the above command, the STORAGE_GCP_SERVICE_ACCOUNT property contains the GCP service account name. Example output:
+-----------------------------+---------------+------------------------------------------------------+------------------+
| property                    | property_type | property_value                                       | property_default |
+-----------------------------+---------------+------------------------------------------------------+------------------+
| ENABLED                     | Boolean       | true                                                 | false            |
| STORAGE_ALLOWED_LOCATIONS   | List          | bucket-link-that-we-will-send-you                    | []               |
| STORAGE_GCP_SERVICE_ACCOUNT | String        | service-account-id@<project>.iam.gserviceaccount.com |                  |
+-----------------------------+---------------+------------------------------------------------------+------------------+
We will take this service account and grant it permission to add and modify data in the GCS bucket.

3. Create File Formats

A Parquet file format can be created in Snowflake as follows to unload Snowflake data in Parquet format. Other formats such as CSV or JSON can also be used. Here is an example script that creates a Parquet file format:
USE DB_NAME;
USE SCHEMA SCHEMA_NAME;
CREATE OR REPLACE FILE FORMAT parquet_unload_file_format
TYPE = PARQUET
SNAPPY_COMPRESSION = TRUE
COMMENT = 'FILE FORMAT FOR UNLOADING AS PARQUET FILES';
More options and file types are described in the Snowflake documentation: https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html
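As a point of reference, a CSV file format (one of the alternatives mentioned above) could be defined along these lines; the format name and options are illustrative:
USE DB_NAME;
USE SCHEMA SCHEMA_NAME;
-- Illustrative CSV alternative to the Parquet format above
CREATE OR REPLACE FILE FORMAT csv_unload_file_format
TYPE = CSV
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
COMPRESSION = GZIP
COMMENT = 'FILE FORMAT FOR UNLOADING AS CSV FILES';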

4. Create External Stage

In Snowflake, an EXTERNAL STAGE object references data files stored in a location outside of Snowflake.
USE DB_NAME;
USE SCHEMA SCHEMA_NAME;
CREATE OR REPLACE STAGE parquet_unload_gcs_stage
URL = 'gcs-bucket-link'
STORAGE_INTEGRATION = gcs_int
FILE_FORMAT = (FORMAT_NAME = 'parquet_unload_file_format')
COMMENT = 'GCS Stage for the Snowflake external Parquet export';
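Once the stage exists, you can verify that Snowflake can reach the bucket by listing its contents (the listing will be empty until the first unload):
-- Confirms the storage integration and stage are wired up correctly
LIST @parquet_unload_gcs_stage;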

5. Unload Data

Use the COPY INTO <location> command to unload data from a Snowflake table into the GCS bucket through the external stage. To retain the column names in the output file, use the HEADER = TRUE copy option.
COPY INTO @<EXTERNAL_STAGE_NAME>/<file_name_prefix>
FROM DB_NAME.SCHEMA_NAME.TABLE_NAME
HEADER = TRUE;
For example:
COPY INTO @parquet_unload_gcs_stage/region
FROM SF01.TEST01.REGION
HEADER = TRUE;
For the minimal set of fields we need, refer to Data Models. The COPY command also supports various parameters that can limit the partitions exported and optimize the unload: https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html
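As an illustration, a daily unload can be restricted to a single day's partition by wrapping a query in the COPY command. This is only a sketch: the EVENTS table, its event_date column, and the option values are placeholders rather than recommendations.
COPY INTO @parquet_unload_gcs_stage/events/2024-01-01
FROM (
    SELECT *
    FROM DB_NAME.SCHEMA_NAME.EVENTS      -- placeholder table name
    WHERE event_date = '2024-01-01'      -- placeholder partition column
)
HEADER = TRUE
OVERWRITE = TRUE                         -- replace any previous export for that day
MAX_FILE_SIZE = 268435456;               -- split the output into ~256 MB files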

6. Automate the Unloading

We recommend automating the COPY command so that each day's events (a single partition) are unloaded into the external stage on a schedule, using a Snowflake TASK: https://docs.snowflake.com/en/sql-reference/sql/create-task.html An example would be:
CREATE TASK copy_to_gcs
SCHEDULE = 'USING CRON 0 10 * * * Europe/Rome'
COMMENT = 'Test Schedule' AS
COPY INTO @parquet_unload_gcs_stage/region
FROM SF01.TEST01.REGION
HEADER = TRUE;
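Note that a newly created task is suspended until it is resumed, and the example above copies the full table on every run. Below is a sketch of resuming the task and of a variant that unloads only the previous day's partition (EVENTS and event_date are placeholder names):
-- Tasks are created in a suspended state and must be resumed to start running
ALTER TASK copy_to_gcs RESUME;

-- Variant that unloads only yesterday's partition
CREATE OR REPLACE TASK copy_events_to_gcs
SCHEDULE = 'USING CRON 0 10 * * * Europe/Rome'
COMMENT = 'Daily unload of yesterday''s events' AS
COPY INTO @parquet_unload_gcs_stage/events
FROM (
    SELECT *
    FROM SF01.TEST01.EVENTS              -- placeholder table name
    WHERE event_date = CURRENT_DATE - 1  -- placeholder partition column
)
HEADER = TRUE;

ALTER TASK copy_events_to_gcs RESUME;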
If you use Airflow for orchestration on your side, you can also use it to schedule this query at a regular interval. Alternatively, we can fetch the data directly from Snowflake, as described below.

Snowflake Direct Access Setup

Create a dedicated Aampe user with read-only access to your data warehouse. Once configured, we’ll handle the ETL scheduling on our side.

Step 1: Switch to Account Admin

Run this in your Snowflake worksheet:
USE ROLE ACCOUNTADMIN;

Step 2: Create Role and User

Create a read-only role for Aampe (skip if you have an existing read-only role you’d prefer to use):
CREATE ROLE AAMPE_ROLE;

CREATE USER AAMPE_DEVELOPER
  PASSWORD             = 'aampe_developer'     
  EMAIL                = '[email protected]'
  MUST_CHANGE_PASSWORD = FALSE
  DEFAULT_ROLE         = AAMPE_ROLE
  DEFAULT_WAREHOUSE    = <YOUR_WAREHOUSE>;

GRANT ROLE AAMPE_ROLE TO USER AAMPE_DEVELOPER;

Step 3: Grant Read Access

Grant the role access to your warehouse, database, and tables. Adjust the scope as needed:
USE ROLE ACCOUNTADMIN;

-- Current objects
GRANT USAGE ON WAREHOUSE <MY_WH> TO ROLE AAMPE_ROLE;
GRANT USAGE ON DATABASE <MY_DB> TO ROLE AAMPE_ROLE;
GRANT USAGE ON ALL SCHEMAS IN DATABASE <MY_DB> TO ROLE AAMPE_ROLE;
GRANT SELECT ON ALL TABLES IN DATABASE <MY_DB> TO ROLE AAMPE_ROLE;
GRANT SELECT ON ALL VIEWS IN DATABASE <MY_DB> TO ROLE AAMPE_ROLE;

-- Future objects (so new tables are automatically accessible)
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE <MY_DB> TO ROLE AAMPE_ROLE;
GRANT SELECT ON FUTURE TABLES IN DATABASE <MY_DB> TO ROLE AAMPE_ROLE;
GRANT SELECT ON FUTURE VIEWS IN DATABASE <MY_DB> TO ROLE AAMPE_ROLE;
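To double-check the setup before sharing credentials, you can inspect what the role and user ended up with:
-- Review the privileges granted above
SHOW GRANTS TO ROLE AAMPE_ROLE;

-- Confirm the role assignment and the user's defaults
SHOW GRANTS TO USER AAMPE_DEVELOPER;
DESC USER AAMPE_DEVELOPER;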

Step 4: Share Connection Details

Send us the following:
  • Account (from your URL, e.g., tzviibq-hr75240 from https://tzviibq-hr75240.snowflakecomputing.com)
  • Warehouse Name
  • Database Name
  • Schema Name
  • Table Names (tables containing your event data)
Once we receive these, we'll configure the connection and begin syncing data at regular intervals.
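If you are unsure of the account identifier, the context functions below are one way to look it up from a worksheet; note that the account locator returned by CURRENT_ACCOUNT() may differ from the organization-style identifier in your URL:
-- Organization and account name together form the identifier, e.g. tzviibq-hr75240
SELECT CURRENT_ORGANIZATION_NAME(), CURRENT_ACCOUNT_NAME();

-- Account locator and region, useful as a fallback
SELECT CURRENT_ACCOUNT(), CURRENT_REGION();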