LiveRamp

Store your event, dispatch, and visitor data in LiveRamp through an S3 integration

LiveRamp Data Warehouse Integration

The LiveRamp Data Warehouse integration enables secure data collaboration and analytics. It uses S3 as an intermediary storage layer, letting you import your event, dispatch, and visitor data into LiveRamp's cleanroom environment.

How the Integration Works

  • Daily S3 Deposits: Events and dispatches are automatically deposited into your S3 bucket on a daily basis
  • Visitor Updates: Visitor deposits contain only records whose last-seen timestamp falls on or after the previous day, so they must be upserted rather than appended
  • Complete Data: All events and dispatches from your account are included in the deposits
  • Secure Transfer: Data is transferred securely to LiveRamp's environment
  • Cleanroom Analytics: Enable privacy-safe data collaboration and analysis

Data Organization

The data in your S3 bucket is organized in a partitioned structure:

s3://your-bucket/
  ├── events/
  │   └── YYYY/
  │       └── MM/
  │           └── DD/
  │               └── *.parquet
  ├── dispatches/
  │   └── YYYY/
  │       └── MM/
  │           └── DD/
  │               └── *.parquet
  └── visitors/
      └── YYYY/
          └── MM/
              └── DD/
                  └── *.parquet
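
A loader can derive the prefix for any given day directly from this layout. The following is a minimal sketch using boto3; the bucket name, dataset, and date are placeholders:

from datetime import date

import boto3

s3 = boto3.client("s3")

def list_daily_files(bucket: str, dataset: str, day: date) -> list[str]:
    """Return the S3 keys of one day's Parquet files for a dataset
    (events, dispatches, or visitors) under the YYYY/MM/DD layout."""
    prefix = f"{dataset}/{day:%Y/%m/%d}/"
    keys = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        keys += [o["Key"] for o in page.get("Contents", []) if o["Key"].endswith(".parquet")]
    return keys

print(list_daily_files("your-bucket", "events", date(2024, 5, 1)))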

This partitioning by year/month/day makes it easy to:

  • Query specific time periods efficiently (see the sketch after this list)
  • Manage data retention policies
  • Process historical data in batches
  • Use partition projections for optimized querying
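
For example, the YYYY/MM/DD directories can be declared as partition fields so that readers prune to the requested dates instead of scanning the whole prefix. A minimal sketch with pyarrow; the bucket name and filter values are placeholders:

import pyarrow as pa
import pyarrow.dataset as ds

# Declare the YYYY/MM/DD directory levels as year/month/day partition fields.
partitioning = ds.partitioning(
    pa.schema([("year", pa.int32()), ("month", pa.int32()), ("day", pa.int32())])
)

events = ds.dataset("s3://your-bucket/events/", format="parquet", partitioning=partitioning)

# Only the matching 2024/05 directories are read; the filter prunes the rest.
table = events.to_table(filter=(ds.field("year") == 2024) & (ds.field("month") == 5))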

Data Processing Considerations

Events and Dispatches

Events and dispatches are complete daily snapshots: each day's Parquet files contain every event and dispatch that occurred on that date.
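
Because each day is self-contained, reloads are idempotent: re-ingesting a day means replacing that day's partition rather than appending to it. A minimal pandas sketch, where the bucket name and date are placeholders and reading s3:// paths requires s3fs:

import pandas as pd

# Read one day's complete events snapshot into a DataFrame.
events = pd.read_parquet("s3://your-bucket/events/2024/05/01/")

# Load pattern for snapshots: delete the target rows for that date, then
# insert `events`, so re-running the same day never duplicates data.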

Visitors

Visitor data is incremental: each day's deposit contains only records whose last-seen timestamp falls on or after the previous day. You'll therefore need an upsert process to merge this data into your data lake, warehouse, or database:

  1. Read the Parquet files from the visitors directory for the current day
  2. Identify existing records in your target system using visitor identifiers
  3. Update existing records with new information from the Parquet files
  4. Insert new records for visitors that don't exist in your system
  5. Handle conflicts based on your business logic (e.g., latest timestamp wins)

This incremental approach ensures you have the most up-to-date visitor information while maintaining data consistency across your analytics infrastructure.
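
As a concrete illustration of steps 2 through 5, here is a minimal pandas sketch of a latest-timestamp-wins upsert. The visitor_id and last_seen column names and the file paths are assumptions; substitute the identifiers and storage from your own schema:

import pandas as pd

def upsert_visitors(existing: pd.DataFrame, incoming: pd.DataFrame) -> pd.DataFrame:
    """Merge a daily visitor deposit into the current table, keeping the
    most recent row per visitor (latest last_seen wins)."""
    combined = pd.concat([existing, incoming], ignore_index=True)
    combined = combined.sort_values("last_seen")
    # One row per visitor survives: the newest occurrence.
    return combined.drop_duplicates(subset="visitor_id", keep="last")

incoming = pd.read_parquet("s3://your-bucket/visitors/2024/05/01/")  # step 1
current = pd.read_parquet("visitors_current.parquet")  # your existing table
upsert_visitors(current, incoming).to_parquet("visitors_current.parquet")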

Getting Started

To set up the LiveRamp Data Warehouse integration:

  1. Contact your account manager, who will enable the integration and provide the permissions required for your S3 bucket
  2. Configure your S3 bucket to receive the daily data deposits
  3. Set up LiveRamp access to your S3 bucket using one of the methods below

Once configured, your event, dispatch, and visitor data will be automatically deposited into your S3 bucket daily, ready for import into LiveRamp's cleanroom environment. Remember to implement the appropriate upsert logic for visitor data to maintain data consistency in your target systems.

Setting Up LiveRamp Access to S3

LiveRamp provides two methods to access your S3 data:

  1. Authorize LiveRamp's User (Recommended)

    • Most secure method
    • No credential sharing required
    • Uses LiveRamp's existing AWS account
  2. Create an IAM User

    • Create a dedicated IAM user in your AWS account
    • Share credentials securely with LiveRamp
    • More control over access permissions

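For the first method, the authorization typically takes the form of a bucket policy that grants read access to the AWS principal LiveRamp supplies. The sketch below uses boto3; the principal ARN and bucket name are placeholders, not LiveRamp's actual identifiers, so take the real values from LiveRamp's documentation:

import json

import boto3

bucket = "your-bucket"  # placeholder
principal = "arn:aws:iam::111122223333:user/liveramp-placeholder"  # placeholder ARN

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowLiveRampRead",
            "Effect": "Allow",
            "Principal": {"AWS": principal},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        },
        {
            "Sid": "AllowLiveRampList",
            "Effect": "Allow",
            "Principal": {"AWS": principal},
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{bucket}",
        },
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
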
For detailed setup instructions, refer to LiveRamp's S3 documentation.

Cleanroom Capabilities

LiveRamp's cleanroom environment enables powerful data collaboration while maintaining privacy:

  • Secure Data Matching: Match your data with other datasets without exposing raw data
  • Privacy-Safe Analytics: Run analyses across multiple data sources
  • Audience Insights: Gain deeper understanding of your customers
  • Measurement: Measure campaign effectiveness across platforms
  • Activation: Activate your data across multiple channels