(~40 minute read)
—
The Plays:
Play 1 – $160M ARR – Implementing a more disciplined volume and term-based discounting framework for better price execution
Play 2 – $200M ARR – Establishing an adoption packaging program based on experimentation to place value creation before value capture and exploit the benefits of viral adoption
Play 3 – $450M ARR – Creating standardized burstable ELAs
Play 4 – $600M ARR – Reimagining the monetization model for the cloud platform: Transitioning from a subscription to a hybrid subscription + consumption pricing model
—
Before jumping into this post, I’d like to acknowledge a few folks at Alteryx for their instrumental role across these four plays and around all things pricing and packaging at the company. These plays would not have been possible without them.
- Bruce Strainer (Former VP of Data Science) – Amongst the smartest and most analytically driven people I’ve ever met and a true player-coach with a work ethic second to none. Bruce was my partner in crime and did much of the heavy lifting around the quantitative and statistical modeling we would undertake to design these plays
- Tim Olaerts (VP Revenue) & Kevin Morihisa (Sr Dir, Revenue & Deal Desk) – These two were the amazing operators on the front lines who took our craziest ideas, refined them and spearheaded their operationalization while ensuring we stayed compliant with all Rev Rec rules
- Kevin Rubin (Former CFO) – Kevin was a true thought partner who somehow always managed to provide a rare nugget of input that would unlock that key missing piece of the puzzle. What began as a random whiteboarding session in his office over a brief lunch break ultimately evolved into Alteryx’s burstable Enterprise License Agreement (ELA) framework that was front and center for investors in how they evaluated the company’s performance
- Suresh Vittal (CPO) / Paula Hansen (CRO) – These two played a key role in operationalizing the ELAs and in driving alignment around the new monetization model for cloud
—
During my tenure at Alteryx, my portfolio of responsibilities included a variety of strategic initiatives, among them oversight of our pricing strategy and execution function. One of the more intellectually stimulating aspects of my role during this time involved serving as the primary architect of our pricing strategy (up until ~$850M in ARR), an expertise I had honed in my previous life. (Fun fact: I began my career working with the great Tom Nagle and have authored multiple pricing case studies, with author credit, in his bestselling pricing book that is used at all of the top-five U.S. MBA programs.) During this period at Alteryx, we embarked on a series of strategic monetization initiatives that played an important role in helping the business scale from $160M to ~$1B in ARR. This post explores the details behind these decisions and outlines the strategic rationale, analytical frameworks and the impact of those decisions in a case study format.
Note: While this document covers the detailed analysis frameworks we used for decision-making, the actual data presented has been simulated, jittered, or de-identified to preserve confidentiality. The only real data points referenced here are those that Alteryx discloses publicly in SEC filings or earnings calls.
For context, let's start with a brief overview of Alteryx and its product suite back when we first began this effort. In summary, Alteryx is a horizontal self-service enterprise analytics software platform focused on enabling line-of-business knowledge workers to transform data into breakthroughs without needing advanced technical skills. These can be insights or the automation of analytic processes that drive tremendous efficiency in mission-critical tasks (e.g. back-office finance functions, marketing campaigns, or supply chain efficiency). The suite of products in the Alteryx platform included a set of capabilities that allowed knowledge workers to discover a broad array of data assets, connect to them, build sophisticated analytical pipelines including descriptive and predictive models in a code-free (or code-friendly) manner, schedule these pipelines as self-service interactive apps or unattended data workflows, and finally deploy machine learning models as REST API endpoints for other services to consume. A key strategic principle for the platform was openness and extensibility: while it offered capabilities that were fairly broad, the platform also interfaced well with other tools. As a result, it wasn't unusual for customers to deploy the platform as part of a broader stack that could include a best-of-breed data cataloging product (e.g. Alation), a dedicated visualization tool (like Tableau), or an RPA platform (like UiPath). The platform was fundamentally agnostic to the initial data persistence layer and the end consumption layer and integrated with all the major players at these endpoints.
While the product portfolio today is broad and includes a set of on-prem and cloud products, the platform capabilities in 2018 entailed four core products with simplified descriptions as follows:
- Alteryx Connect: A metadata catalog that serves as a social data discovery platform for knowledge workers of all skill levels
- Alteryx Designer: The flagship offering that allows users to design descriptive, predictive and spatial data pipelines using code-free or code-friendly methods (through integrations with SQL, R and Python). This product comes in several different packaged offerings that include differences in bundled datasets, spatial tools and advanced machine learning and NLP functionality. If you view the world through the lens of a data scientist, the simplest way to think of Designer is as a visual, code-free abstracted equivalent of Python Pandas, R or procedural SQL. If you identify more as an Excel user, think of Designer as eliminating the need to do lookups, pivot tables, sorts, filters, etc., letting you build models hundreds of columns wide and millions of rows long, then keep feeding them new data and have them auto-update with a single click. The magic of the product comes from enabling users with an Excel skill set to rapidly achieve analytical outcomes that were previously only possible with knowledge of Python, R or advanced SQL.
- Alteryx Server: A centralized server product capable of being deployed on-prem or in a VPC environment that allows Designer users to schedule their workflows as analytic apps or unattended workflows to run in a scalable and recurring manner. It includes a centralized browser-based UI called private gallery where these assets reside and has different interfaces for admins vs. end consumers. Knowledge workers (e.g. a line of business manager) can consume the apps published to gallery and re-run them with different (parametrized) inputs to get contextualized answers
- Alteryx Promote: Allows citizen and professional data scientists to deploy a no-code or R/Python based machine learning model as a REST API endpoint so it can be embedded in other business applications (e.g. Salesforce) without requiring it to be re-coded from scratch by IT in the native languages of those other applications (e.g. Java)
All products were licensed on a yearly subscription basis, and Alteryx Designer and Server were the flagship products in the portfolio. Alteryx Designer was licensed under a named-user model whereas the other products were priced under a core-based model. The broader business at ~$150M in ARR operated largely under a land-and-expand strategy where new accounts were landed with a handful of seats to service one use-case in a sales cycle that spanned a few weeks to a few months. Over time, with the help of customer success, we would grow our footprint in those accounts and spread to different departments, use-cases and personas, making net expansion the fundamental bedrock of our growth strategy. The largest customers tended to standardize enterprise-wide on the Alteryx platform and had thousands of active license subscriptions along with a full platform deployment.
—
Play 1 – $160M ARR – Aligning price to value and fixing our discounting problem through a more coherent discounting framework
Alteryx IPO'd in early 2017 and reported full-year revenue of ~$135M for 2017 (75% YoY growth). Shortly thereafter (mid-2018) is when we launched the first strategic monetization play. The GTM engine was performing well on the surface; however, the sales team was highly decentralized and had broad decision rights and leeway to structure deals. The end result was a bit of a "wild-west" dynamic to deal structuring. The team had pushed through some really creatively structured deals that were not atypical for a small (~$50M ARR) company but were more questionable for a company ~3x that size (which we had grown to). Ironically, we hadn't applied the same level of analytical rigor to this part of our business despite pricing analytics being a core use-case for the Alteryx platform. The CEO therefore made getting a certain percent Average Selling Price (ASP) lift a top-3 companywide strategic OKR for the year, along with implementing a more coherent discounting model and deal-desk approval process. Furthermore, we had the pressure of an external compelling event. We were required by the SEC to transition our accounting practices from ASC 605 to ASC 606, and one of the first-order consequences of that transition was needing to maintain SSP and VSOE in tighter bands than our baseline adhered to. Said differently, we had to get discounting streamlined and under control.
An analysis like this involves looking at effective prices and discounts across a key set of value drivers and ensuring that 1) the trendlines are coherent and align with what one would expect, and 2) the variation within those trendlines is acceptable, so as to maintain fairness across similar customers and adhere to the pricing rules set by ASC 606. Ultimately, what one needs to ensure is that a given customer's total Annual Contract Value (ACV) is more a function of their actual contracted SKUs and contract terms and less a function of their ability to negotiate or of a rep's willingness to give in at the end of a tight quarter. The first step then is itemizing the set of value drivers on the basis of which price & discount should vary. For Alteryx these were:
- Volume: Specifically, the number of seats for our named user products and cores for our Server based products. These were the building blocks of value via which a customer scales the deployment of our platform in their enterprise and the basis on which we scale our value capture as well. Per traditional economic theory and value-based pricing approaches, the unit price per user or core should be inversely related to volume in a disciplined price execution environment
- Contract duration: This is the contract term, which for Alteryx tended to be 1 or 3 years, though we typically invoiced for one year (and no less than that). Longer terms were beneficial to the company as they reduced the risk of churn and locked out competitive entrants that might try to use asymmetric acquisition tactics. Contract duration therefore merits a price incentive. Contract duration also impacts GAAP revenue under ASC 606 since the model recognizes a percent of total contract value upfront
- Channel: These represented the GTM partners we leveraged in the field, which included Value Added Resellers (VARs), referral partners, SIs etc. To align incentives, partners that served as an extension of our direct sellers received either margin per unit or referral fees depending on the type of partner they were and the amount of business they did with us
- Other contractual terms: These were other terms that had some economic value to us and include items such as payment terms (these impact working capital), period in quarter when deals are booked (upfront is better for us to maintain healthy linearity vs. having everything book at the end of the quarter), upfront multi-year billings (can significantly impact overall business cash flow)
The analysis, then, required looking at these factors across our entire portfolio of customers to understand the current state.
An important factor to note here is that, like most companies, we used 'Enterprise License Agreements' to structure large deals with some of our most valuable customers. There was no universal definition for these deals other than that they were bespoke deals structured for a given customer that took into account their unique situation and often had some element of 'unlimited' usage. In our systems, these got captured as a single all-encompassing line item with a single contract ACV, making unit price and discount calculations on these deals impossible. This was a big problem because, in our customer Pareto distribution, many of our top-20% customers fell into this cohort. In order to do the analysis correctly then, we had to manually disaggregate the content of these customer contracts. This involved reading the actual contracts and looking for clauses that listed some level of max license usage caps, and where that didn't exist, using telemetry and licensing data to see how many licenses of various products these customers had actually activated over a given period of time to estimate their effective 'unit price'. Luckily we had an awesome team of analysts to help with this and automate much of it using our own analytics platform.
The chart below is a simulated illustration of the output of this analysis. Note: the data is de-identified to show what insights one can surface from this analysis without disclosing proprietary company data.
The figure on the left shows account-level ACV against discount rates for active accounts, and the figure on the right shows discount distributions across ACV buckets for the deciles of active accounts.
Given that this is a fairly complex visual, let’s break down what specifically we were looking for:
- Are the medians across deciles on the figure on the right consistently sloping upward or are there unnatural disconnects at various points where the gradient of the curve changes? If there are disconnects, they need to be investigated to determine if they are indicative of an inflection point in customer willingness to pay or whether they are internal mental goalposts around which sellers tend to rapidly step up discounting
- Are the interquartile ranges across the box plots relatively tight and sufficiently separated from one bucket to another, or are they all overlapping? If they are fairly wide and largely overlapping from one group to the next, it suggests that there is low coherence to the volume-based discounting strategy at the company, which needs to be fixed with better discipline in price execution
- For each boxplot group, is there sufficient separation between the medians of single-year vs. multi-year deals? The magnitude of this difference should be equivalent to the economic expected value of the multi-year contract. If it’s too large (or too small), the contract term is getting disproportionally incented in the pricing model
- How high is the median discount at the upper groups? Has the net price fallen below what you would expect to be a "floor" for value capture? If it's effectively going to zero (95%+ discount), you will likely have an asymptote (and therefore a ceiling) in the "expansion" part of an account-level land-and-expand model and will need to adjust TAM expectations for your offering accordingly, particularly if your product is targeted at the top end of the market (Global 2000 orgs)
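As a side note, this kind of decile/box-plot view is straightforward to reproduce. Below is a minimal sketch in Python/pandas of how one might bucket account-level bookings into ACV deciles and compute the quartile statistics described above; the file name and column names are hypothetical placeholders, not our actual data model.

```python
import pandas as pd

# Hypothetical account-level extract: one row per active account.
# Assumed columns: account_id, acv (net annual contract value),
# list_acv (ACV at list price), term_years (1 or 3).
deals = pd.read_csv("account_bookings.csv")

# Effective discount per account = 1 - (net ACV / list ACV).
deals["discount"] = 1 - deals["acv"] / deals["list_acv"]

# Bucket accounts into ACV deciles (10 equal-count groups by net ACV).
deals["acv_decile"] = pd.qcut(deals["acv"], q=10, labels=False) + 1

# Box-plot style summary per decile and contract term: the quartiles show how
# tight (or overlapping) discounting is, and the gap between 1-year and 3-year
# medians shows how heavily the term incentive is being applied.
summary = (
    deals.groupby(["acv_decile", "term_years"])["discount"]
    .describe(percentiles=[0.25, 0.5, 0.75])
    .loc[:, ["count", "25%", "50%", "75%"]]
)
print(summary)
```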
What we learned
After several iterations of refining the analysis above, a few key insights became very clear to us:
- Our discounting practices were very inconsistent. The correlation between price and volume was much weaker than we expected, with heavy overlap in interquartile discount ranges across account-level ARR deciles. When we ran a set of statistical models to determine what factors predicted discounts, the features that bubbled up were not the ones we wanted to see (rep/manager, time of quarter etc.). For a company of our size, it became clear that we needed to rein this in
- The median discounts by band had unnatural inflections at certain ARR thresholds for different business segments. These were particularly pronounced for older sellers and sales managers that tended to assume the presence of a “market ceiling” on contractual ACV in accounts once deals exceeded a certain size. In other words, they assumed ‘unlimited’ deals were the norm at a certain size causing prices to compress aggressively
- Discounts for contract duration were much greater and more variable than we expected particularly for large deals
- Bespoke deal structures (ELAs) were being used to structure deals to obfuscate unit price dynamics at deal sizes that didn’t warrant using them
It became abundantly clear to us that we needed to introduce a formal discounting matrix to the business, revamp the deal-desk approval process and possibly adjust seller comp plans. We went about architecting the new model with the following design parameters:
- Primary discounting metric: The primary decision needed here was whether to build the matrix based on gross ACV or # of seats as the primary metric. Both have their pros and cons. Gross ACV can normalize across differently-priced products in the portfolio. The benefit of seats (which was the price metric for the flagship product but didn't apply to all products) was that it was easier for customers to understand, since that is how they tended to purchase (e.g. it is easier to tell customers "if you go from 20 to 40 seats, your price per seat will be X" than "if you go from $20K to $60K in gross ACV, your discount and net ACV will be X" and have them impute the seat price). For this reason, we decided to opt for seats as the primary metric
- Cumulative vs. incremental as the discount baseline: In a land and expand model, this one can be tricky and is best illustrated with an example. If a customer has 10 seats and wants to buy 5 more, should the discount eligibility be calculated on a 5-seat deal or a 15-seat deal? A 5-seat basis would certainly be more financially beneficial to the company and would incent customers to buy in larger volumes (vs. incrementally), whereas the cumulative basis gives them the benefit of the full 15 licenses when determining the discount on the incremental purchase. This is one where we did some experimentation and landed on the cumulative model, because the incremental-only approach became a major point of customer dissatisfaction for the smaller customers that wanted to buy seat licenses incrementally as their teams and use-cases scaled (vs. buying a lot upfront). The other benefit of this model is that, at the margin, it is actually more beneficial (lower TCO) for customers to purchase more licenses and move up the volume curve, which creates a nice incentive for expansion
- Discount level: Setting the actual max discount level for deal bands was a key design decision and is tricky when the baseline current-state data shows big distributional spreads. While conjoint analysis can help here, getting this right requires some experimentation. We adopted an approach where we targeted the 70th percentile of realized prices for mid-size deployments as the net price target and worked from there to establish thresholds for smaller deals and a coherent discount curve for larger deployments (targeting roughly the 70th percentile there as well).
- Term discount: To calculate term discounts, we focused on churn rates to estimate the expected annual benefit of locking someone into a 3-year deal (as we would minimize churn over that period). This expected value became the incremental multi-year term discount percentage (a simplified sketch of how the volume and term components combine into a max-discount lookup appears after this list)
- Other factors: There can be lots of other ancillary deal terms that can impact price (e.g. payment terms, contract start date, swapping rights etc.) that have the potential to add a ton of complexity to the discounting model. Our philosophy here was to start simple and layer in complexity over time. As a result, instead of being formulaic with all of these “other” factors, we issued guidelines for how to think about these during negotiations and have the deal-desk help with the details on a deal-by-deal basis
- Incentives and Approvals: No matter how good a model is on paper, it ultimately comes down to how well it is implemented, which is a function of incentives and approvals. Being too heavy-handed can add a ton of friction to the deal process, while not pushing back enough can render the whole exercise moot. To drive initial implementation, we focused on creating incentives for the rep. If a deal came in anywhere above the price implied by the max discount threshold (based on volume and term), the rep would earn 20% of the value of that upside (the difference between the actual and max-discount price) as incremental compensation. This was the threshold we felt was needed to incent sellers adequately. Additionally, we created a graduated approval matrix where, based on the size of the deal, the type of item discounted and the magnitude of the requested discount, progressively higher levels of approval were required. Discounts above the max on small deals could be approved by second-line managers; however, big discounts on large deals or on variable-cost items required CFO approval and were much more heavily scrutinized.
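To make the mechanics concrete, here is a minimal sketch of how a volume- plus term-based max-discount lookup of this kind could be expressed. Every number below (the seat bands, the discount levels, the churn assumption) is a hypothetical placeholder for illustration, not our actual matrix.

```python
# Hypothetical volume bands: (minimum cumulative seats, max volume discount).
# The real matrix anchored on percentile targets of realized prices; these
# values are placeholders only.
VOLUME_BANDS = [
    (1, 0.00),
    (20, 0.10),
    (50, 0.20),
    (150, 0.30),
    (500, 0.40),
]

# Term incentive derived from churn: if annual gross churn were ~7%, locking a
# customer into a 3-year term removes roughly that expected annual loss, so the
# incremental multi-year discount is capped near that expected value.
ASSUMED_ANNUAL_CHURN = 0.07
MULTI_YEAR_TERM_DISCOUNT = ASSUMED_ANNUAL_CHURN

def max_discount(cumulative_seats: int, term_years: int) -> float:
    """Maximum approvable discount for a deal, computed on cumulative seats
    (existing + incremental), per the cumulative baseline chosen above."""
    volume_discount = 0.0
    for threshold, discount in VOLUME_BANDS:  # bands listed in ascending order
        if cumulative_seats >= threshold:
            volume_discount = discount
    term_discount = MULTI_YEAR_TERM_DISCOUNT if term_years >= 3 else 0.0
    return volume_discount + term_discount

# A customer with 10 existing seats buying 5 more on a 3-year term is
# evaluated as a 15-seat cumulative deal.
print(max_discount(cumulative_seats=15, term_years=3))  # 0.07 with these placeholders
```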
The final discounting matrix could be visualized as follows:
KPIs and Impact
Picking the right KPIs is critical to driving the appropriate outcomes. For us, these were:
- Percent of deals closed within the approval bands
- Wtd. avg discount vs. the target line
- Net realized price for each band
KPI #1 is self-explanatory. When we implemented the model, we simulated it on historical data to assess what impact it would have had if we had adhered to it. Our baseline adherence rate was about 60%. We set an 80% goal for year 1 and estimated the impact of that to be ~$14M net of incentive costs. KPI #2, the weighted average discount vs. the target line, is important for a business that had seasonality like ours and a steadily increasing average deal size. For quarters like Q4, which tend to have larger deals than say a Q2, tracking raw wtd. avg. discounts wasn't helpful: larger average deal sizes will by definition result in larger average discounts, so Q4 will always look worse than Q2/Q3 on that raw measure. Instead, what matters is how actuals compared to the theoretical wtd. average discount permitted by the matrix. Closing this gap over time is what drives accretive value. Finally, net realized price by band was a metric we tracked to measure strategic price compression over time in the business and to ensure that our TAM estimates based on scaled unit pricing were indeed accurate and that there wasn't any evidence of commoditization in the business.
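For what it's worth, KPI #2 is simple to operationalize once every closed deal carries its matrix-permitted max discount. The sketch below (hypothetical file and column names) compares the ACV-weighted actual discount to the ACV-weighted discount the matrix would have allowed, by quarter.

```python
import pandas as pd

# Hypothetical closed-deal extract; max_allowed_discount would come from a
# lookup like the max_discount() sketch shown earlier.
deals = pd.read_csv("closed_deals.csv")  # columns: quarter, acv, discount, max_allowed_discount

def wtd_avg(df: pd.DataFrame, col: str) -> float:
    # ACV-weighted average of a discount column.
    return (df[col] * df["acv"]).sum() / df["acv"].sum()

by_quarter = deals.groupby("quarter").apply(
    lambda q: pd.Series({
        "actual_wtd_discount": wtd_avg(q, "discount"),
        "target_wtd_discount": wtd_avg(q, "max_allowed_discount"),
    })
)
# KPI #2 is the gap between actual and matrix-permitted weighted discount;
# closing it over time is what drives accretive value.
by_quarter["gap"] = by_quarter["actual_wtd_discount"] - by_quarter["target_wtd_discount"]
print(by_quarter)
```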
While we tweaked the model over time, the ~$15M impact we achieved in year 1 alone by exceeding the 80% adherence target made this effort a great success.
Play 2 – $200M ARR – Establishing an adoption packaging program based on extensive experimentation
The primary sales motion at the company at this time (both for net-new logos and for expansion business) was what we would call "bottom-up, product-assisted". Oftentimes, end-users would respond to a campaign or get introduced to the product and end up on our free trial. Once the trial expired after 15 days, a customer had to engage with a rep to purchase the product. In addition to free trials, we also had a 30-day 'evaluation license' for slightly larger teams and more complex use-cases that required more time. Reps were authorized to extend this to 60 days based on certain criteria.
Our hypothesis was that once there was a base level of adoption and value derived from a handful of initial use-cases in an account, the product could potentially spread much faster if enabled correctly. Of course we could try to persuade a customer upfront to go from 5 seats to purchasing 50, but even with sophisticated selling approaches that incorporated value engineering and ROI studies, some customers want to see "proof" before writing big checks. One way modern software vendors get around this friction is by adopting usage-based / consumption pricing (pay only for what is used); however, this was not feasible for Alteryx at this stage (more on this later). Instead, we needed something to simulate consumption-based pricing dynamics, which is when we invented the adoption program. It worked as follows: a customer received a bundle of licenses and dedicated Customer Success resources at deeply discounted rates for several months. Given the low initial hurdle price, the licenses got distributed to a broad set of end-users who otherwise may not have qualified for the product based on perceived price vs. ROI. They deployed the entitlements and used them for a period of time on real use-cases. At the end of this period, once value was proven out and we had a bunch of happy users and champions, the customer was much more likely to write a big check for actual demonstrated value. Now, on the surface, this may seem like just another extended trial, but the difference is in the details and the specific design parameters, which we discovered through extensive and rapid experimentation. Let's look at each of these design parameters.
- Price: For trials and extended evals, price was 0 for the entire period. While we tried free adoptions spanning several months, what we found was that having some small price tag on these bundles was critical to ensuring that the customer had skin in the game and was engaged. We landed on a discount rate of high double digits for these programs based on experimentation.
- Product Portfolio: Another key design decision is what set of products to include and in what quantities. Should ancillary products be included or not? Again this required experimentation and what we found was that different bundle sizes worked for different situations. That said, unlimited licenses were almost never accretive and neither was the strategy of including ancillary products upfront. Limiting the package to just the core flagship products produced better results as measured by conversion.
- Duration: This was highly correlated with #2 (number and qty of products included) and customer maturity and varied considerably by situation. For a small customer with low maturity, 3 months was sufficient. For a larger customer wanting to scale to a much larger deployment, 9 months was critical.
- Customer Success / Enablement: This was arguably the most critical component of the program. For a product like Alteryx that requires end-user adoption and alignment with mission-critical business processes, having post-sales resources to facilitate organic adoption alongside a customer champion on the ground was essential.
Ultimately, we landed on a construct that included 4 potential adoption packages that were successively larger and longer in duration: a small, medium, large and extra-large. These ranged from 90 to 180 days, 50 to 500 end-user licenses and were heavily discounted by high double digit percentages. All of them involved engagement with the customer success team that would run a very specific playbook based on the package.
The outcome of this program was very impressive. Not only were we able to scale accounts that were stuck on a handful of seats to several hundred thousand dollars in ARR a year later via this program, we saw newer account cohorts grow faster as well. Overall, net dollar retention for this program, twelve months from the start of adoption, was twice the rate of equivalent accounts (adjusted for size and duration), making it very accretive to the top line. Though the magnitude of this lift got smaller over the years, the adoption program continued to drive a meaningful impact and was a key arrow in the GTM teams' arsenal right up to $1B in ARR.
Play 3 – $450M ARR – Creating standardized burstable Enterprise Licensing Agreements
Though enterprise deals had been critical from the $50M ARR stage onwards, as we began to approach the $500M ARR milestone, large enterprise deals began to take on even more significance. Ensuring that these were structured correctly with the appropriate upside was critical. At this stage of growth, there was no unified definition of what constituted an "Enterprise Agreement". Additionally, there were a few issues in particular that we were dealing with:
- While we had a strong suite of products, there was lack of clarity on what the ideal up-sell and cross-sell motion should look like. What was the optimal sequence of products to purchase and what were the right ratios? Should all products be packaged together or should there be different good/better/best configurations? Reps were navigating this on an ad hoc basis
- Many large customers until this point had managed to negotiate effectively unlimited license agreements for the flagship product around the $1M deal-size point. Earlier-stage reps that were used to smaller deals had inadvertently begun to use this threshold as a ceiling on deal size and were giving away more than they needed to once a deal approached this scale. While this resulted in strong near-term growth (i.e. getting an account from $300K to $600K), it caused rapid deterioration in the net unit price of the products and artificially capped our potential in accounts that had a much larger theoretical TAM
- The manner in which services (support, success, pro serv etc.) were included in deals was ad-hoc and highly variable. This created a lot of variation in the customer experience
- Adoptions (see play #2) which had been used very effectively to move accounts from a handful of licenses to mid-size deals were being somewhat misapplied in these situations in attempts to drive incremental growth
These dynamics are illustrated in the figure below. (Note: actual data jittered for confidentiality)
As illustrated above, after the “G” seat threshold, we observed a large unit price decline and much higher realized price variation that we needed to control if we wanted to maintain predictable growth in our accounts.
In order to solve this with the creation of standardized ELA bundles, we approached the problem with a framework that included 6 design parameters and came up with guiding principles that informed the analysis and design elements for the final model.
- Tier schematics: This included elements such as the # of tiers, primary tier metrics and the starting threshold of ELAs. The overarching guiding principle here was that we wanted the total number of tiers to be manageable and easy to communicate. The primary metric needed to be easy to understand and ideally aligned with the a la carte pricing model. Additionally, the starting threshold for these enterprise bundles needed to be high enough to be truly 'enterprise scale'.
- Packaging and bundle composition: This included traditional packaging considerations such as what products to include and in what ratios, and whether there should be different good/better/best flavors of tiers. Because the product suite wasn't that extensive (5 products in total), we landed on a model where we had a single type of ELA for each size-based tier that included the 3 major product lines, while the 2 ancillary products could be purchased as add-ons. The quantity ratios of the major product lines (Designer, Server and core add-ons, and the Intelligence Suite) were based on a combination of what we believed the right ratios of these products should be and what historical data suggested were the ratios customers actually purchased these products in. Different levels of support and services were included based on what we had learned was needed for successful enablement. This was another area where customers could purchase premium levels of services if they wanted to
- Price levels: Once packaging was directionally established, we needed a methodology to calculate actual price for the bundle of products and a discounting methodology for the tiers. This was a key part of the original problem statement we had set out to solve (preventing rapid unit price deterioration at scale). The conceptual approach we used here was to target a percentile of realized price for different volume ranges (~70th percentile) up until we saw rapid unit price deterioration and then continue the curve downwards from that point at the same expected slope (vs. the steeper slope that we were currently experiencing). Obviously, the price levels had to be consistent and lower than what was possible under the a la carte structure and inherit the same considerations such as term discounts. Additionally, the model had a unique burst capacity component embedded in it which was its most defining feature (discussed in more detail below)
- Operationalization: This dimension was all about how to implement these and included system considerations such as whether we should build these as hard or soft bundles, if line item pricing should be exposed, whether we should have different SKUs for base vs. burst capacity, how licenses should be delivered, and how we should deal with co-terming vs. contract replacements for instances where multiple contracts existed for the same customer. The guiding principles here were based on a combination of strategic considerations and factors that would drive a fast time to market.
- Channel considerations: This was about whether and how we wanted channel partners (resellers / SIs) to leverage the bundles. Since these bundles involved significant price incentives, we couldn't just have the ELAs inherit the traditional tiered channel discounts. Instead, we ended up adjusting the reseller discount framework so that partner discounts on ELAs were smaller than under the a la carte partner discount model
- Upselling mechanics: This dimension entailed building in structural trigger points that create compelling opportunities for a customer to expand and is detailed below
The most innovative feature of the ELAs was what we defined as bursting capacity. This feature involved taking a page out of our adoption playbook (play #2 in this document), and essentially cultivating demand by seeding product usage for future upsell through free temporary capacity that was incremental to what the customer paid for in the tier that they purchased. We knew from the success of the adoption program that the strategy accelerated net expansion when administered correctly. What we needed to better understand was how it should work at a different scale of deployment. This was essentially an analytical exercise to understand how much temporary excess seat capacity is optimal in an account to drive future expansion. If it was too high, we would be creating a glut of license capacity in an account and eliminating scarcity. If it was too low, we would be leaving potential upside on the table. To do so, we analyzed what levels of excess capacity resulted in higher net expansion at renewals and discovered that the number varied from 20% to 50% based on the scale of deployment.
Ultimately, the bursting framework worked as follows. Each ELA tier came with a certain amount of free excess capacity that the customer could "burst" into. It was typically 20%-40% incremental to what the customer was paying for based on their purchased tier. The burst was valid for up to 1 year from the time of the deal (it could be configured to be shorter if the seller wanted). At the end of this 1-year period, if the customer had used this excess capacity, they would need to either 1) pay for the incremental licenses on an a la carte basis, or 2) move up to the next tier of ELA, in which case they would unlock another 1-year burst of license capacity and steadily move up the ARR curve. The way we calibrated the tiers was that if a customer were to utilize even half of the burst capacity, it would make more sense for them to move to the next tier of ELA than to true up those incremental licenses individually.
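Here is a simplified sketch of the renewal-time economics this creates for the customer. The tier prices, seat counts and a la carte price below are purely hypothetical; the actual tiers were calibrated so that consuming even half of the burst made the next tier the cheaper path.

```python
def true_up_cost(burst_seats_used: int, a_la_carte_price: float) -> float:
    # Option 1: pay for the consumed burst seats individually at a la carte pricing.
    return burst_seats_used * a_la_carte_price

def upgrade_cost(current_tier_acv: float, next_tier_acv: float) -> float:
    # Option 2: incremental cost of moving up to the next ELA tier
    # (which also brings more entitlements plus a fresh 1-year burst).
    return next_tier_acv - current_tier_acv

# Hypothetical example: current tier at $500K with 100 seats and a 30% burst
# (30 extra seats); next tier at $575K; a la carte seat price of $6K.
current_tier_acv, next_tier_acv = 500_000, 575_000
a_la_carte_price = 6_000
burst_seats_used = 15  # customer used half of the 30-seat burst

if true_up_cost(burst_seats_used, a_la_carte_price) >= upgrade_cost(current_tier_acv, next_tier_acv):
    print("Cheaper to move up to the next ELA tier (and unlock a new burst)")
else:
    print("Cheaper to true up the consumed burst seats a la carte")
```

With these placeholder numbers the true-up ($90K) already costs more than the tier upgrade ($75K), which is exactly the kind of calibration that nudged customers up the ARR curve.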
While the final ELA bundle table is confidential, the figure below illustrates how the unit price dynamics work for ELAs and how it co-existed with the a la carte model. The ELAs begin at seat range “E-F” and offer better unit pricing compared to the a la carte model from that point onwards with additional term-based incentives.
The figure below illustrates more detailed unit pricing dynamics for ELAs with the "burst" capacity and how it compares to a la carte pricing. Not only was the light orange dot (the ELA unit price) lower than the a la carte price of the bundles (the green dot), the bursted price (the dark orange dot) was significantly lower still. This temporary incentive gave customers the feeling of having received even better pricing; if they used the burst capacity, it would get trued up and result in strong net dollar retention for the company, creating a virtuous growth cycle and continued ARR growth & TAM capture at the account level. Driving effective utilization of this capacity was a key objective of our customer success teams.
KPIs and Measuring impact
Like all other monetization initiatives, having clear KPIs is key to driving desired outcomes. For the purposes of this exercise, we landed on three top-level KPIs to measure the success of ELAs. These were:
- (Usage) Percent use-rate on eligible deals
- (Growth) Net dollar retention
- (Discounting) Avg. percent discount rate
The first of these is a penetration metric that measures the percent of eligible deals on which our sellers adopted this contractual framework. Since it was only meant for larger deals, we knew adoption should be measured as a subset of those larger deals. Our first-year adoption target was 40% of all eligible deals. Unless the model got adopted by reps, we wouldn't be able to drive its intended impact. The second KPI was the primary impact metric. This measured the net dollar retention (a core companywide KPI) on this cohort of accounts vs. other contractual vehicles. If this was higher than average, it meant that the ELAs were having their intended effect (driving disproportionate growth). The third metric was the average discount rate. Part of what we intended to solve with these ELAs was high unit price compression at larger deal bands. In addition to driving higher dollar retention, we needed these vehicles to help improve discounting and control price compression.
By all accounts, the year-1 performance of these vehicles was a big success. Not only did we exceed the target adoption rate, we saw an almost 2x increase in net dollar retention on this cohort of deals and material improvements in unit price compression. The figure below illustrates net dollar retention growth for ELAs vs. our other contracting structures (a la carte deals and custom agreements) in year 1. Additionally, the ELAs ended up becoming one of the central conversation themes between management and Wall Street analysts / the broader investor community. Furthermore, after the first-year cohort reached the conclusion of their burst period, we saw additional upward expansion (via the bursting mechanics described above) on more than 70% of all the deals sold, a promising trend that would continue to drive the positive growth flywheel we envisioned.
Play 4 – $600M ARR – Reimagining the business model for the cloud platform: Evaluating a transition from Subscription to Consumption pricing
In 2021, Alteryx released its most significant new product since inception, called Designer Cloud. This was a cloud-based (multi-tenant) version of Designer, the company's flagship offering that up until then was only available as a desktop deployment with a satellite Server offering that was VPC-based. Designer Cloud was a full re-platforming effort that involved redesigning and reimagining the product from the ground up to be cloud native and offer a host of net-new capabilities. In addition to innovating on the technology, we set out to reimagine the business and commercial model of the offering as well.
As stated earlier, Designer desktop was a seat-based subscription model and Server (which allowed scheduling and scalability) was core-based. In the cloud, Server would cease to exist since a customer no longer needed to deploy their own infrastructure. This essentially put about a third of the company's revenue stream at risk. At minimum, we would need to raise seat-based prices to offset the loss from Server. This was, however, an opportunity to consider a more material change and evaluate alternate net-new models. One that was particularly interesting to us was a consumption (usage-based) model. We shortlisted three candidate models and set out to understand which of these made the most sense. The candidate models were:
- Good / better / best – seat-based subscription model
- A pure consumption model
- Some type of hybrid model
We established a set of decision criteria and guiding principles to help inform the analysis needed. These entailed the following dimensions:
- Maintain overall revenue equivalence (and ideally improve it)
  - Over time, we knew that most of the business would migrate to cloud, so we needed to ensure the model would be accretive on overall revenue. Since we had half a billion in baseline on-prem ARR, this meant that we couldn't have a model that would result in <1x this number in aggregate (leaving aside the puts/takes at the account level), as that would cannibalize overall revenue
- Allow faster expansion in enterprise segments
  - We wanted a model that would accelerate net dollar expansion in our largest potential accounts through more frictionless up-sell and cross-sell by better aligning value creation to value capture
- Unit economics should be above a certain threshold (allow us to maintain our gross margins)
  - The on-prem business had ~90% gross margins. While we knew this number would go down since we would be hosting infrastructure as part of the managed service offering, we needed to maintain certain gross margin thresholds to meet the long-term model targets that we had committed to our investors. This meant that costs had to be a key consideration
- Accelerate cloud adoption
  - Since cloud was a strategic imperative, we wanted a model that would accelerate cloud adoption (even if this was achieved through temporary offers)
- Ease of implementation
  - The complexity of the model needed to be balanced with implementation considerations. We didn't want a model that would be very difficult to implement from a technical standpoint
  - Also, in addition to technical complexity, we didn't want the model to result in massive winners and losers at the customer level, which would make roll-out difficult
- Complements ELA bundles already in market
  - As outlined in play #3 in this document, we had rolled out Enterprise License Agreements (ELAs) that had become our most strategic enterprise-selling vehicle. We needed the model to complement these ELAs
Since Designer desktop was a single seat-based subscription, a good-better-best subscription model would involve packaging the product with a set of different feature combinations with varying price points. As Designer was considered a premium-priced product in the market, one question we had to tackle was whether it made sense to introduce a lower priced ‘Designer Lite’ version with limited feature functionality. While the packaging and pricing details and potential candidate combinations we considered were quite complex, in a nutshell, we needed to know whether the following was true:
(Price A * projected mix of lite offering) + (Price B * projected mix of better offering) + (Price C * projected mix of best offering) > (today's price * full offering)
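As a purely hypothetical numeric illustration of this test (none of these are our actual prices or mix assumptions), the check is simply whether the mix-weighted blended price clears today's single price:

```python
# Hypothetical per-user prices and projected mix shares for a good/better/best split.
today_price = 5_000                       # current single-SKU price per user per year
packages = {                              # package: (price per user, projected share of users)
    "lite":   (1_500, 0.30),
    "better": (4_500, 0.50),
    "best":   (7_500, 0.20),
}
blended_price = sum(price * mix for price, mix in packages.values())
print(blended_price, blended_price > today_price)  # 4200.0 False -> dilutive under this mix
```

Under this illustrative mix the blended price is dilutive, which mirrors the conclusion we ultimately reached about a 'lite' offering below.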
The consumption model analysis boiled down to selecting a portfolio of candidate usage metrics and then analyzing what the unit bookings-equivalence point would be on each usage metric by looking at usage patterns via telemetry data. The usage metric had to be acceptable to customers, and the revenue-equivalence point on that metric had to be something we believed we could exceed if we were to disaggregate the current subscription offering and put it at risk through a consumption model. The statistical distribution of usage on the given metric across the customer base really mattered. Said differently, we needed to ensure that:
(current net subscription price / average usage volume on metric) = price per unit usage (PPU). PPU needed to be acceptable to customers and have reasonable variability across users
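The telemetry side of this analysis can be sketched as follows (again with hypothetical file and column names): compute the revenue-equivalent PPU, then look hard at the shape of the usage distribution rather than just its average.

```python
import pandas as pd

# Hypothetical per-user telemetry extract: annual usage on a candidate metric
# (e.g. workflows run) alongside each user's current net subscription price.
usage = pd.read_csv("user_telemetry.csv")  # columns: user_id, net_price, workflows_run

# Revenue-equivalent price per unit of usage (PPU) if the entire base were
# flipped to this metric at current revenue levels.
ppu = usage["net_price"].sum() / usage["workflows_run"].sum()

# The distribution matters as much as the average: with a heavy-tailed
# (power-law-like) usage pattern, the mean sits far above the median, so a
# pure consumption model reprices most users dramatically.
mean_to_median = usage["workflows_run"].mean() / usage["workflows_run"].median()
print(f"PPU: {ppu:.4f}, mean/median usage ratio: {mean_to_median:.1f}")

# Implied annual spend per user under pure consumption vs. today.
usage["implied_spend"] = usage["workflows_run"] * ppu
print(usage["implied_spend"].describe(percentiles=[0.5, 0.8, 0.95]))
```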
After going through a series of design workshops, we landed on a set of design considerations that we tested through customer and prospect surveys to further refine before conducting more sophisticated Conjoint and Gabor-Granger studies to quantitatively model revenue impact under each scenario.
On the first model (subscription-based good/better/best packages), the conjoint study revealed that the optimal price for Designer Cloud was pretty close to the current net price of the on-prem product. The lite product was found to not be particularly accretive to overall revenue at the current stage of product maturity. Said differently, it didn’t make strategic sense to have an unlimited-use lower capability cloud offering.
On the consumption model side, we shortlisted three main candidate usage metrics. These were 1) workflows run (i.e. queries or jobs in Alteryx parlance), 2) vCPU hours consumed, and 3) a synthetic Alteryx consumption unit that was a combination of the first two and a few additional metrics (storage GBs and compute type).
Customers were more accepting of the first two of these metrics and showed some hesitancy toward a synthetic usage metric, which was more complex and perceived as opaque. The bigger challenge however was that the usage patterns followed a power-law distribution. This meant that the mean consumption per user was almost 10x the median consumption per user, and while the dollar-equivalence value at the mean was reasonable, it was quite high for the median usage pattern. While this wasn't entirely shocking – after all, power users exist across most offerings – it did surface a big strategic dilemma for management, which is the core of what any application-layer subscription-based company considering a usage-based model will likely have to contend with. If we were to hypothetically flip the business model over to a pure consumption-based one overnight, we would essentially be asking 20% of our users to pay for ~80% of our ARR while the remaining 80% of customers would pay for the other 20% of ARR. While a customer Pareto distribution like this isn't uncommon for a software business, the key difference is that the usage-based Pareto didn't entirely overlap with the existing customer ARR Pareto distribution. Driving such a massive change in customer-level ARR was clearly not feasible and illustrated that value realized and willingness-to-pay for our products was not linearly correlated with usage. There was a sizable number of use-cases where less frequent usage was offset by the high value received per use of the product, something that we couldn't reconcile with a pure consumption-based commercial model.
This brings us to the third model we considered (i.e. a hybrid consumption model). This model involved a combination of fixed platform fees, subscription fees per user and consumption volume and could be tiered by good/better/best. Essentially, the revenue equation for this model worked as follows:
Total Bookings = Fixed platform fee + (blended price per user per year * # of users) + (price per unit of usage * total usage units)
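Translating that equation into a toy example (all price points hypothetical, not Alteryx's actual cloud pricing):

```python
def hybrid_bookings(platform_fee: float,
                    price_per_user: float, users: int,
                    price_per_unit: float, usage_units: float) -> float:
    """Total bookings under the hybrid model: fixed platform fee +
    per-user subscription + metered consumption."""
    return platform_fee + price_per_user * users + price_per_unit * usage_units

# Hypothetical example: a 50-user deployment with a modest platform fee and
# 20,000 automated runs beyond the included allowance.
print(hybrid_bookings(platform_fee=10_000,
                      price_per_user=4_000, users=50,
                      price_per_unit=0.50, usage_units=20_000))  # 220000.0
```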
Ultimately we discovered that this model was the best fit for the cloud platform for a number of reasons:
- The fixed platform fee helped offset some of the revenue erosion from the on-prem Server migrations, while we could still make it sufficiently low to achieve the more frictionless new-logo and expansion motion that we had laid out as part of our strategic objectives.
- Though we discovered that unlimited-use, feature-fenced good/better/best tiers would not be accretive to the overall business, we found that by fencing tiers on usage, we could create a series of subscription tiers that aligned price with willingness-to-pay across a heterogeneous user base. The 'lite' offering in this case was a subscription package that had full feature functionality but limited manual workflow runs. The better package had a much higher cap on manual workflow runs, sufficient for the majority of users. The best version offered the highest-value feature add-on (scheduling) and came with unlimited manual runs.
- In addition to manual runs, the greatest value of the platform offering was automated jobs through the orchestrator which could be used to automate all kinds of manual analytic processes. Though the base platform fee came with a fixed allowance of automated runs, this dimension represented the variable consumption monetization vector for the platform that offered a path to monetize power usage. Most importantly, it was aligned with the value-creation potential of the platform.
While we iterated a bit on the specific price levels and usage thresholds that we fenced different versions on, the overall roll-out was smooth and well-received by the market. Though it is still relatively early days, this model promises to continue to power the growth of the cloud business well into the future.