Your Google Analytics data lake options

| | Parallel GA tracking | Scitylana |
|---|---|---|
| Vendors | Segment, Clickstream, OWOX, etc. | Scitylana |
| Method | A script is placed alongside the GA script, making a best effort to collect roughly the same clickstream data | No additional script. Data are collected only by GA. Hit-level data are compiled through numerous calls to the GA API |
| Implementation time | Typically 1 day to 2 weeks, depending on available hardware and skills | None. Data available within 30 minutes |
| Implementation cost | USD 500 to USD 10,000, depending on cost of resources | None. Data copied as is |
| Hardware, hosting and costs | Depends heavily on the vendor. A real-time tracking server, a parsing server, and storage are necessary. USD 100 to USD 3,000 monthly | GA collects, parses, and stores data as usual at no cost. Data are transferred directly from GA to storage. USD 349 monthly |
| Data granularity | Often hit-level, sometimes session-level | Always hit-level |
| Data assortment | Typically ranging from a few select dimensions and metrics to 50 different dimensions and metrics | All dimensions and metrics are available as found in GA, with very few exceptions. This is some 300 dimensions and metrics |
| Data attainability | From the point of implementation onward. Pausing the license causes loss of data | Any data may be attained, both historic and going forward. If the license is paused, data is still collected by GA and thus not lost |
| Data availability | Real-time or daily | Forward data are delivered daily. Historic data begin to appear a few minutes after license activation |
| Continuous data precision (deviation from GA) | Depends on vendor and implementation. Hits: 5-20%. Sessions: 3-15%. Users: 2-10% | Being an exact copy, there are normally no deviations. Hits: 0%. Sessions: 0%. Users: 0% |
| Historic data precision (deviation from GA) | Data not available prior to script implementation | Hits: 0%. Sessions: 0%. Users: 0% |
| Taxonomy (how raw data is interpreted into hits, sessions, users, browsers, traffic sources, campaigns, etc.) | Depends heavily on the vendor and its ability to interpret data and resolve sessions, users, browsers, campaigns, etc. | Being an exact copy of GA, the taxonomy of GA is followed strictly. There are no differences from GA's interpretation of the data |
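
The precision figures for hits, sessions and users are stated as a deviation from GA. The source does not define how that deviation is calculated, so the sketch below assumes it means the absolute difference between a parallel tracker's count and GA's count, expressed as a percentage of GA's count. The function name and the daily totals are hypothetical, invented purely for illustration.

```python
def deviation_from_ga(tracker_count: int, ga_count: int) -> float:
    """Absolute difference between a tracker's count and GA's count,
    as a percentage of GA's count (assumed definition of 'deviation')."""
    if ga_count <= 0:
        raise ValueError("ga_count must be positive")
    return abs(tracker_count - ga_count) * 100.0 / ga_count


# Hypothetical daily totals, not measurements from any vendor.
ga_totals = {"hits": 100_000, "sessions": 20_000, "users": 15_000}
parallel_totals = {"hits": 88_000, "sessions": 18_400, "users": 14_250}

deviations = {
    metric: deviation_from_ga(parallel_totals[metric], ga_totals[metric])
    for metric in ga_totals
}

for metric, pct in deviations.items():
    print(f"{metric}: {pct:.1f}% deviation from GA")
```

With these made-up totals the script reports 12.0% for hits, 8.0% for sessions and 5.0% for users, which falls inside the ranges quoted for parallel tracking. An exact copy of the GA data would give 0% for all three.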
