Collect: Data Collection
Overview of the Collect phase in the CORE framework
Data collection is the foundation of any analytics strategy. This phase covers best practices for collecting high-quality data that powers your analytics and decision-making.
Overview
The Collect phase focuses on gathering reliable, well-structured data from all relevant sources. A well-designed collection strategy ensures consistency, reliability, and privacy compliance across your entire analytics system.
Key Concepts
Schema Design
A well-designed schema ensures consistency and reliability across your data collection:
- Event naming conventions: Use clear, descriptive names (e.g., `purchase_completed`, not `evt_123`)
- Parameter standardization: Define required and optional parameters upfront
- Versioning strategy: Plan for schema evolution as your needs grow
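The three points above can be sketched as a small schema object. This is only an illustration under stated assumptions: the `EventSchema` class, the snake_case naming pattern, and the `order_id`, `value`, and `coupon` parameters are hypothetical and not part of the framework.

```python
import re
from dataclasses import dataclass, field

# Assumed convention: lowercase snake_case with at least two words,
# e.g. "purchase_completed". Opaque names such as "evt_123" are rejected.
EVENT_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)+$")


@dataclass
class EventSchema:
    """Hypothetical schema: a name, a version, and typed parameters."""
    name: str
    version: int  # bump when parameters are added, removed, or retyped
    required: dict[str, type]
    optional: dict[str, type] = field(default_factory=dict)

    def validate(self, params: dict) -> list[str]:
        """Return a list of problems; an empty list means the event conforms."""
        errors = []
        if not EVENT_NAME_PATTERN.match(self.name):
            errors.append(f"event name {self.name!r} is not descriptive snake_case")
        for key, expected in self.required.items():
            if key not in params:
                errors.append(f"missing required parameter {key!r}")
            elif not isinstance(params[key], expected):
                errors.append(f"parameter {key!r} should be {expected.__name__}")
        for key, value in params.items():
            if key in self.required:
                continue  # already checked above
            if key not in self.optional:
                errors.append(f"unexpected parameter {key!r}")
            elif not isinstance(value, self.optional[key]):
                errors.append(f"parameter {key!r} should be {self.optional[key].__name__}")
        return errors


purchase_v1 = EventSchema(
    name="purchase_completed",
    version=1,
    required={"order_id": str, "value": float},
    optional={"coupon": str},
)
print(purchase_v1.validate({"order_id": "A1", "value": 9.5}))
```

Keeping the version on the schema itself is one possible way to let old and new event shapes coexist while consumers migrate.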
Server-Side Tagging
Server-side tagging provides several advantages:
- Reduced client-side load: Move processing to your servers
- Enhanced privacy: Better control over data collection
- Improved reliability: Less dependency on client-side JavaScript
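To make the privacy and control points concrete, here is a minimal sketch of what a server-side step might do with a payload received from the client before forwarding it. Everything in it is an assumption for illustration: the `process_on_server` function, the `ALLOWED_FIELDS` allowlist, the consent flag, and the salted hash are hypothetical and do not describe any specific tagging product.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional

# Hypothetical allowlist: only these fields leave the server.
ALLOWED_FIELDS = {"event", "user_id", "value", "currency"}


def process_on_server(client_payload: dict, consent_granted: bool,
                      salt: str = "example-salt") -> Optional[dict]:
    """Filter and enrich a client payload on the server (illustrative only)."""
    if not consent_granted:
        return None  # drop the event entirely when consent is missing
    # Keep only allowlisted fields, so extras such as IP addresses are dropped.
    event = {k: v for k, v in client_payload.items() if k in ALLOWED_FIELDS}
    if "user_id" in event:
        # Replace the raw identifier with a salted SHA-256 digest.
        raw = (salt + str(event["user_id"])).encode("utf-8")
        event["user_id"] = hashlib.sha256(raw).hexdigest()
    # The timestamp is set by the server, not trusted from the client.
    event["received_at"] = datetime.now(timezone.utc).isoformat()
    return event


payload = {"event": "purchase_completed", "user_id": 42, "ip": "203.0.113.7"}
print(process_on_server(payload, consent_granted=True))
```

Because the filtering runs on infrastructure you control, changing the allowlist does not require shipping new client-side JavaScript.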
Outcomes
- Event schema defined and documented
- Tracking plan created and approved
- Server-side tagging infrastructure deployed
- Consent management integrated
- Data quality validation in place
- Privacy and compliance requirements met
Artifacts
- Tracking Plan: Document outlining all events, parameters, and data collection requirements
- Event Schema: Structured definition of events and their properties
- Tagging Implementation: Server-side or client-side tagging setup
- Consent Management: Integration with consent management platform
- Data Validation Rules: Automated checks to ensure data quality
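As one possible shape for the data validation rules listed above, the sketch below counts a few common problems in a batch of collected events. The `check_batch` function, the `event_id` and `timestamp` field names, and the five-minute clock-skew tolerance are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_batch(events: list[dict], now: datetime,
                max_skew: timedelta = timedelta(minutes=5)) -> dict[str, int]:
    """Count data quality problems in a batch of events (illustrative only)."""
    seen_ids = set()
    report = {"duplicate_id": 0, "missing_id": 0, "future_timestamp": 0}
    for event in events:
        event_id = event.get("event_id")
        if event_id is None:
            report["missing_id"] += 1
        elif event_id in seen_ids:
            report["duplicate_id"] += 1
        else:
            seen_ids.add(event_id)
        # Timestamps further ahead than the allowed skew suggest a bad clock.
        ts = event.get("timestamp")
        if isinstance(ts, datetime) and ts > now + max_skew:
            report["future_timestamp"] += 1
    return report


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
batch = [
    {"event_id": "a", "timestamp": now},
    {"event_id": "a", "timestamp": now},                  # duplicate ID
    {"timestamp": now + timedelta(hours=1)},              # no ID, future time
]
print(check_batch(batch, now))
```

A report like this could feed an alert threshold, so that a spike in duplicates or missing IDs is caught before the data reaches dashboards.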
Pitfalls
- Over-collection: Collecting too much data without clear purpose leads to noise and privacy concerns
- Inconsistent naming: Poor event naming makes analysis difficult and error-prone
- Ignoring privacy: Failing to implement consent management can lead to compliance issues
- Client-side only: Relying solely on client-side tracking creates reliability and privacy risks
- No validation: Missing data quality checks allows bad data to pollute your analytics