
User stitching

User stitching is required to combine website and mobile app data, to unify customer interaction data across different platforms, or generally to track a user's journey through various touchpoints. It is a technique for combining data from multiple sources or sessions into a unified view of a user. This is especially useful when a user interacts with a system across multiple devices or platforms and the goal is to consolidate these interactions into a single, coherent user profile.

User identification

User identification is the process of recognizing the identity of a user. This typically involves associating a user with a mostly unique identifier. There are three levels of user identification in walkerOS:

  • id: Represents a unique identifier for a user, typically drawn from internal systems like a CRM.
  • device: Used as a device-specific identifier, saved in the storage.
  • session: Refers to a session-specific identifier to track user activities within a single session.

A single user can have several devices, each with multiple sessions. The id distinctly identifies an individual user across sessions and devices. The device helps to recognize a user across multiple sessions on the same device, providing continuity in tracking user interactions over time, while the session is for understanding user behavior and interactions within a confined timeframe.

The diagram represents one user with two devices across three sessions.
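The one-user, two-devices, three-sessions case can be sketched as a small data structure. This is only an illustration: the class name `Identity` and the values `u1`, `d1`, `s1` and so on are hypothetical and not part of walkerOS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identity:
    """The three identification levels; any of them may be unknown."""
    session: Optional[str] = None  # valid within a single session
    device: Optional[str] = None   # persisted in storage on one device
    id: Optional[str] = None       # user id, e.g. drawn from a CRM


# One user on two devices across three sessions (hypothetical values).
identities = [
    Identity(session="s1", device="d1", id="u1"),
    Identity(session="s2", device="d1", id="u1"),
    Identity(session="s3", device="d2", id="u1"),
]

users = {i.id for i in identities}
devices = {i.device for i in identities}
sessions = {i.session for i in identities}
print(len(users), len(devices), len(sessions))  # 1 2 3
```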

info

These techniques may require a user's consent and documentation to respect a visitor's privacy and stay compliant. This guide is not legal advice.

Event log

Events are individually measured user interactions that need to be stitched together. Besides the event-related data, each event carries additional information that helps to identify a user. This is necessary to create user journeys.

The following table represents a minified event log over several days. It shows how user activities can be tracked over time, across different sessions, with varying levels of user identification and consent categories. A group id is generated with each run, roughly corresponding to a page load. A hash is a temporary fingerprint. session, device, and user are all simplified unique identifiers.

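| Step | Day | Event | Consent | Group | Hash | Session | Device | User |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Monday, 1st | session start | analytics | a | foo | | | |
| 2 | Monday, 1st | page view | analytics | a | foo | | | |
| 3 | Monday, 1st | consent deny | analytics | a | foo | | | |
| 4 | Monday, 1st | page view | analytics | b | foo | | | |
| 5 | Tuesday, 2nd | session start | marketing | c | bar | 1 | phone | |
| 6 | Tuesday, 2nd | page view | marketing | c | bar | 1 | phone | |
| 7 | Tuesday, 2nd | consent accept | marketing | c | bar | 1 | phone | |
| 8 | Tuesday, 2nd | session start | marketing | d | bar | 2 | phone | |
| 9 | Tuesday, 2nd | page view | marketing | d | bar | 2 | phone | |
| 11 | Wednesday, 3rd | session start | marketing | e | baz | 3 | phone | olli |
| 12 | Wednesday, 3rd | page view | marketing | e | baz | 3 | phone | olli |
| 13 | Wednesday, 3rd | user login | marketing | e | baz | 3 | phone | olli |
| 14 | Friday, 5th | session start | analytics | f | qux | | | |
| 15 | Friday, 5th | page view | analytics | f | qux | | | |
| 16 | Friday, 5th | page view | analytics | g | qux | | | olli |
| 17 | Friday, 5th | consent deny | analytics | g | qux | | | olli |
| 18 | Friday, 5th | user login | analytics | g | qux | | | olli |
| 19 | Thursday, 11th | session start | marketing | h | lol | 4 | ph0n3 | |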

Based on the data, we can at least summarize the following:

  • 19 tracked events in total
  • 8 different group ids ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h')
  • 6 individual session start events
  • 5 temporary hashes ('foo', 'bar', 'baz', 'qux', 'lol')
  • 4 reliable sessions (1, 2, 3, 4)
  • 2 known devices ('phone' and 'ph0n3')
  • 1 unique user ('olli')

These statistics illustrate the complexity and richness of user data captured through user stitching.
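As a rough check, the distinct-value counts in the summary can be recomputed from the event log. The sketch below transcribes the rows as they are listed in the table (steps 1 to 9 and 11 to 19) into tuples; the tuple layout and the helper name `distinct` are assumptions made for this example.

```python
# Rows transcribed from the table: (event, group, hash, session, device, user).
# None marks an empty cell.
LOG = [
    ("session start", "a", "foo", None, None, None),
    ("page view", "a", "foo", None, None, None),
    ("consent deny", "a", "foo", None, None, None),
    ("page view", "b", "foo", None, None, None),
    ("session start", "c", "bar", "1", "phone", None),
    ("page view", "c", "bar", "1", "phone", None),
    ("consent accept", "c", "bar", "1", "phone", None),
    ("session start", "d", "bar", "2", "phone", None),
    ("page view", "d", "bar", "2", "phone", None),
    ("session start", "e", "baz", "3", "phone", "olli"),
    ("page view", "e", "baz", "3", "phone", "olli"),
    ("user login", "e", "baz", "3", "phone", "olli"),
    ("session start", "f", "qux", None, None, None),
    ("page view", "f", "qux", None, None, None),
    ("page view", "g", "qux", None, None, "olli"),
    ("consent deny", "g", "qux", None, None, "olli"),
    ("user login", "g", "qux", None, None, "olli"),
    ("session start", "h", "lol", "4", "ph0n3", None),
]


def distinct(column: int) -> set:
    """Return the distinct non-empty values of one column."""
    return {row[column] for row in LOG if row[column] is not None}


summary = {
    "groups": len(distinct(1)),
    "session_starts": sum(1 for row in LOG if row[0] == "session start"),
    "hashes": len(distinct(2)),
    "sessions": len(distinct(3)),
    "devices": len(distinct(4)),
    "users": len(distinct(5)),
}
print(summary)
# {'groups': 8, 'session_starts': 6, 'hashes': 5, 'sessions': 4, 'devices': 2, 'users': 1}
```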

Common pitfalls

In general, when working with a user's behavior data, there's always a high chance of incomplete data. This is especially true in a volatile environment like the web, where users can easily delete cookies, change devices, or simply use a different browser. Understanding user identification is key to stitching correctly.

Identifiers change frequently. While it's easy to identify a logged-in user with a unique id, it's more difficult to identify anonymous users. Knowing when to use which level of identification is crucial, as is knowing the pros and cons of each level.
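One possible way to decide which level to use is a simple fallback: take the most reliable identifier an event carries. The following is a minimal sketch under that assumption; the priority order, the function name `stitch_key`, and the dict-based event shape are illustrative and not a walkerOS API.

```python
from typing import Optional, Tuple

# Most reliable level first; the field names mirror the event log columns.
PRIORITY = ("user", "device", "session", "hash")


def stitch_key(event: dict) -> Optional[Tuple[str, str]]:
    """Return (level, value) for the most reliable identifier present."""
    for level in PRIORITY:
        value = event.get(level)
        if value:
            return (level, value)
    return None  # nothing to stitch on


print(stitch_key({"hash": "foo"}))
print(stitch_key({"hash": "bar", "session": "1", "device": "phone"}))
print(stitch_key({"hash": "baz", "session": "3", "device": "phone", "user": "olli"}))
```

An anonymous event falls back to the temporary hash, an event with a stored device id is keyed by device, and a logged-in event is keyed by the user id.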

Working with persistent identifiers requires a user's consent. Consent is always bound to specific purposes, which have to be documented and communicated properly. Persistent identifiers may raise privacy concerns, especially if consent is not respected or the data is not anonymized or properly managed.
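Binding identifiers to consent purposes could look like the sketch below. The mapping `ALLOWED` is an assumption loosely read off the event log, where rows with analytics consent carry no session or device id; it is hypothetical and does not describe how walkerOS or any legal framework assigns purposes.

```python
IDENTIFIERS = {"hash", "session", "device", "user"}

# Hypothetical mapping from consent category to the identifiers it permits.
ALLOWED = {
    "analytics": {"hash", "user"},
    "marketing": {"hash", "session", "device", "user"},
}


def apply_consent(event: dict, consent: str) -> dict:
    """Drop every identifier the given consent category does not permit."""
    allowed = ALLOWED.get(consent, set())  # unknown consent permits nothing
    return {
        key: value
        for key, value in event.items()
        if key not in IDENTIFIERS or key in allowed
    }


raw = {"event": "page view", "hash": "qux", "session": "9", "device": "phone"}
print(apply_consent(raw, "analytics"))  # {'event': 'page view', 'hash': 'qux'}
print(apply_consent(raw, "marketing") == raw)  # True
```

Filtering before storage, rather than at query time, keeps identifiers without consent out of the log entirely.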

Relying on third-party tools and cookies may have been a common and beneficial practice in the past. However, support for third-party cookies is being phased out, and there has always been, and always will be, a full dependence on others if data isn't owned and collected first-party.

Summary

User ids help to stitch together a user's journey across devices and sessions. More importantly, using your own ids helps to master third-party tools by syncing them without relying on third-party cookies. This is fundamental for attribution modeling, creating a holistic view of your users, and gaining data ownership.
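A minimal sketch of the stitching step itself, assuming that anonymous events can be attributed to a user once the same device or hash has been seen together with a user id. The function name `stitch`, the dict-based event shape, and the choice to also stitch on the temporary hash are assumptions for illustration; on shared devices such a rule would merge different people.

```python
def stitch(events: list) -> list:
    """Attribute anonymous events to a user seen with the same device or hash."""
    owner = {}  # (level, value) -> user id
    for event in events:
        if event.get("user"):
            for level in ("device", "hash"):
                if event.get(level):
                    owner[(level, event[level])] = event["user"]

    stitched = []
    for event in events:
        user = event.get("user")
        if not user:
            for level in ("device", "hash"):
                user = owner.get((level, event.get(level)))
                if user:
                    break
        stitched.append({**event, "user": user})
    return stitched


# A few rows taken from the event log above.
events = [
    {"step": 1, "hash": "foo"},
    {"step": 5, "hash": "bar", "device": "phone"},
    {"step": 11, "hash": "baz", "device": "phone", "user": "olli"},
    {"step": 14, "hash": "qux"},
    {"step": 16, "hash": "qux", "user": "olli"},
    {"step": 19, "hash": "lol", "device": "ph0n3"},
]
result = [event["user"] for event in stitch(events)]
print(result)  # [None, 'olli', 'olli', 'olli', 'olli', None]
```

Step 5 is attributed through the shared device and step 14 through the shared hash, while steps 1 and 19 stay anonymous because nothing links them to a known user.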