Feature Success Team

People

What we're building

  • Feature Success Analysis

    Bringing together different parts of PostHog (flags, replay, surveys) to allow users to better analyse the success of a new f...

    Project updates

    No updates yet. Engineers are currently hard at work, so check back soon!

  • Users & recordings linked to feature flags

    We want to make it easier for those who use feature flags to get information on users attached to a particular feature flag, ...

    Project updates

    No updates yet. Engineers are currently hard at work, so check back soon!

Roadmap

Recently shipped

Customizable feature flag timeouts added

Following an incident earlier this month which briefly took down feature flags, we've added a new customizable timeout option for our Go, Ruby, Python, JS, React, and Node SDKs.

By default, flags will now time out after 3 seconds, except for React Native, which times out after 10 seconds.

You can check the docs for info on how to customize timeout configs if needed, but we think the defaults will suffice and help prevent future incidents.
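To illustrate the idea behind a flag timeout, here is a minimal sketch in plain Python. It does not use the PostHog SDK or its real configuration options; `evaluate_flag_with_timeout`, `fetch_flag`, and `fallback` are hypothetical names introduced only for this example. The point it shows is that a slow or failing flag request resolves to a safe default instead of blocking the caller.

```python
import concurrent.futures
import time

# Default taken from the text above (3 seconds); the constant name is hypothetical.
DEFAULT_FLAG_TIMEOUT_SECONDS = 3.0


def evaluate_flag_with_timeout(fetch_flag, fallback=False,
                               timeout_seconds=DEFAULT_FLAG_TIMEOUT_SECONDS):
    """Call fetch_flag() and return its result.

    If the call takes longer than timeout_seconds, or raises, return
    `fallback` so the caller is never blocked by the flag service.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fetch_flag)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # The flag request was too slow: use the safe default.
        return fallback
    except Exception:
        # The flag request failed outright: also use the safe default.
        return fallback
    finally:
        # Do not wait for a slow request to finish before returning.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    def slow_flag_service():
        # Stand-in for a flag request that hangs during an incident.
        time.sleep(0.5)
        return True

    print(evaluate_flag_with_timeout(lambda: True))
    print(evaluate_flag_with_timeout(slow_flag_service, timeout_seconds=0.05))
```

The real SDKs expose their own timeout settings, so the docs remain the reference for the actual option names.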

Goals

As always, reliability is the #1 unwritten goal: Making sure feature flags are reliable trumps every other objective.

Objective: Make sure feature flags can handle 10x current scale

Last quarter we hit some scaling limits on flags: it's now very expensive to run flags on Django, and we'll hit further scaling limits at 5x our current scale.

To get ahead of this problem, we'll rewrite our flags service to be more performant and reliable.

Objective: Polish new experiments UI & collect feedback

Last quarter we finished a basic version of our new experiment UI. This quarter, we want to polish this up, collect feedback from users, and address any issues that come up.

Broadly, we should:

  1. Make it easy for people to set up experiments and understand the results, with clear action items on when to end and what to do after ending experiments.
  2. At every stage of the experiment, tell people what to do next
  3. Have common-sense boundaries, better running-time predictions, and more resilient significance calculations (to address the flip-flopping problem)
  4. Make sure support requests for experiments go down as a result of the above

Objective: Add most requested surveys functionality

We will build, in order of priority:

  • Branching logic with multiple questions
  • Ability to duplicate surveys
  • Customise how many questions show up at a time

Handbook

Values

  • Fast, iterative and high output rather than slow and thoughtful - achieving this
  • Feedback-driven not spec-driven - we do a decent job at this
  • Missionary (we have a clear problem definition and are aligned on how impactful a solution would be) not mercenary - glimpses of this
  • Collaborative not lone wolf - glimpses of this

Personas

Company Persona

  • Primary
    • Size:
      • 20-75 employees
    • Stage:
      • Post-PMF
      • Series A-D
    • Customer type:
      • B2B/B2C/(B2B2C)
    • High expectation traits:
      • Use the modern data stack
      • Frontend uses TypeScript and React
      • High-growth
  • Not:
    • API companies
    • Shopify stores/no-code companies

User Persona

  • Primary
    • Role
      • Product-minded front-end engineer
      • Growth engineer
    • Seniority
      • Decision-making seat on product
      • Senior engineer
      • IC
    • High expectation traits
      • Reads Hacker News
      • Educated about the other feature flagging/experimentation tools in the space
      • Needs high-reliability and high-performance
      • Uses best-in-class tools such as Linear/Figma
  • Secondary:
    • Role:
      • Product Manager
  • Not:
    • Role:
      • Backend engineer
      • Marketing

Jobs to be done

Feature flags

  • Primary
    • Safely roll out frontend features with the least risk
  • Secondary
    • Persistent feature flags e.g. country/pay gate
    • Build/test in production
    • Enable beta users to try out experimental features ahead of time

Experimentation

  • Primary
    • Test whether a particular feature achieves the desired change in user behavior

Feature ownership

You can find out more about the features we own here

Long term vision

Imagine Bob is a product manager, and Alice is an engineer, both of whom love using PostHog.

During their weekly growth review, PostHog shows them that one of their workflows is performing 50% worse than other SaaS companies with a similar flow. They decide to build a new feature together, but they're unsure of the impact, so Bob & Alice decide to gate the feature via a feature flag.

Alice builds the feature and runs the PostHog CLI, automatically converting her feature branch to a feature-flagged version. During creation, she selects the team template they normally use, called "Autorollout based on conversion metric", using the conversion metric that PostHog suggests. The feature progressively rolls out to internal users, then to beta users, then to remaining users. If their conversion metric falls by more than 20%, the feature automatically rolls back and alerts their team. Alice requests a feature flag review from Bob.

Bob checks the PostHog UI and, because it's such an important feature, adds a safety condition for Sentry errors increasing by 30%, plus a few counter metrics. Tripping any of these should result in an automatic rollback as well. Bob starts the experiment.

Thankfully, nothing goes wrong when the feature is rolled out. The team is disappointed, however, that the feature doesn't seem to move any of the core company metrics. This doesn't fit either Alice's or Bob's model, so they dig deeper into why this was the case.

Before they even start, PostHog automatically does some impact analysis on their core metrics, and generates some insights into what properties are highly correlated with conversion & which aren't.

As it turns out, people in the USA and India love their new feature and show a 40% increase in conversion. Other countries, especially the UK, seem to dislike it so much that it negatively affects conversion. In the end, these forces balance out, leading to similar total conversion rates.

They suspect it might have something to do with their positioning in other countries, so they run a marketing experiment using PostHog, where PostHog automatically generates recommended copy text to try out. It generates 5 variants, and they test these in all countries.

As it turns out, copy wasn't the issue, and there's no significant change here. They watch a few recordings from the experiment to confirm there's nothing off here.

Since it's not a positioning issue, Bob & Alice decide that it makes sense to introduce some personalisation: let people opt in to the new feature, and have it on by default for the USA and India. They can customise this right from the feature flag, and set it up so that any users who opt in on their UI automatically get the flag.

PostHog keeps analysing metrics for this flag over time, and notifies Bob and Alice when their customers' behaviour changes: for example, if conversion for users in the UK has taken a turn for the better, or if enterprise customers have taken a turn for the worse.

Our long term vision is to make all of this possible.
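
The auto-rollout rule in the story above can be sketched in a few lines of Python. This is only an illustration of the idea, not an existing PostHog feature or API: the stage names, `StageMetrics`, `should_roll_back`, and `next_stage` are all hypothetical, and it assumes that "falls by more than 20%" and "increasing by 30%" are measured relative to a positive baseline.

```python
from dataclasses import dataclass

# Rollout order from the story: internal users, then beta users, then everyone.
ROLLOUT_STAGES = ["internal", "beta", "everyone"]

# Thresholds from the story, interpreted as relative changes (an assumption).
MAX_CONVERSION_DROP = 0.20
MAX_ERROR_INCREASE = 0.30


@dataclass
class StageMetrics:
    baseline_conversion: float  # conversion rate before the rollout, > 0
    current_conversion: float   # conversion rate for users with the flag on
    baseline_errors: float      # error count before the rollout, > 0
    current_errors: float       # error count for users with the flag on


def should_roll_back(metrics: StageMetrics) -> bool:
    """Return True if either safety condition from the story is tripped."""
    conversion_drop = (
        (metrics.baseline_conversion - metrics.current_conversion)
        / metrics.baseline_conversion
    )
    error_increase = (
        (metrics.current_errors - metrics.baseline_errors)
        / metrics.baseline_errors
    )
    return (conversion_drop > MAX_CONVERSION_DROP
            or error_increase > MAX_ERROR_INCREASE)


def next_stage(current_stage: str, metrics: StageMetrics) -> str:
    """Advance the rollout one stage, or return "rolled_back" if unsafe."""
    if should_roll_back(metrics):
        return "rolled_back"
    index = ROLLOUT_STAGES.index(current_stage)
    # Stay on the final stage once the feature is fully rolled out.
    return ROLLOUT_STAGES[min(index + 1, len(ROLLOUT_STAGES) - 1)]
```

For example, with healthy metrics `next_stage("internal", ...)` moves the flag on to beta users, while a conversion rate that drops from 10% to 7% (a 30% relative fall) rolls the feature back.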