Building a Business Case for Speech Applications

White Paper

You’ve heard good things about speech applications and the positive impact they can have on your business, but how do you calculate this? What strategy underpins a robust business case and what factors are relevant? This white paper highlights the key metrics on which to base a reliable business case.

Below is an excerpt of "Building a Business Case for Speech Applications".


The hard costs

The business case for any speech application is calculated by comparing the aggregate costs of the speech solution, less the savings it delivers, with the costs of the existing solution and procedures.

Costs typically fall into one of two types: ‘up front’ costs (capital expenditure or CAPEX) and ongoing operational costs (operational expenditure or OPEX). It is increasingly popular, especially with a hosted model, to express costs on an operational (OPEX) basis. This generally allows much easier comparison with existing costs, such as the cost of employing agents. (Note: the white paper "Building a business case for speech applications (hosted)" covers a hosted business case.)

Business case parameters

The following table illustrates the parameters associated with the business case, which are cost savings and revenue generation for speech applications. All of these parameters are essential (with the possible exception of 'soft' metrics such as agent morale), since they are needed to understand the costs of the whole infrastructure and business processes that support the speech application. Only by understanding all the costs can the true impact on the business be measured.

Worked example

The following example illustrates a new speech application that handles payment collection from a caller based on the following assumptions:

  • The payment module is used for a number of services, for example taking payments and ordering products/accessories.
  • An existing DTMF application already performs this function; the speech application replaces it.

Traffic is currently in a steady state and not growing - we can therefore show a decrease in port requirements (rather than an absorption of increased traffic). Note that this is only to make the savings more identifiable - the effect is the same.

Note that any CAPEX cost of building the application should be subtracted from this total in year one. In subsequent years this is not necessary. The application would fully pay for itself well within 12 months, perhaps even within 6 months, and then deliver ongoing savings.
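The payback reasoning can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own model: the annual saving is assumed to be the sum of the three recurring savings worked out in the addendum (port OPEX, telecoms and agent handling), and the build cost is a purely hypothetical figure, since the excerpt does not state one.

```python
def payback_months(build_cost, annual_saving):
    """Months of savings needed to cover a one-off build cost."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return build_cost / (annual_saving / 12)


def year_one_net(build_cost, annual_saving):
    """Year-one benefit after subtracting the one-off build cost."""
    return annual_saving - build_cost


# Recurring annual savings from the addendum (GBP):
# port OPEX + telecoms + agent handling. Treating their sum as
# "the total" is an assumption made for this sketch.
ANNUAL_SAVING = 16_500 + 62_700 + 339_500

# HYPOTHETICAL build cost; the paper does not give one.
BUILD_COST = 150_000

months = payback_months(BUILD_COST, ANNUAL_SAVING)
print(f"Payback: {months:.1f} months")
print(f"Year-one net: GBP {year_one_net(BUILD_COST, ANNUAL_SAVING):,}")
```

Changing `BUILD_COST` shows how sensitive the "within 6 months" claim is to the size of the development budget.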

Soft costs and other intangibles

Hard cash savings are an essential part of obtaining business case approval, but they may not present the whole business case. There are many soft issues, both in improving the work experience of the agents who take calls (for example, by removing the more mundane tasks) and for the caller. Prime among these is suitability for purpose. Automating a process that then alienates callers and loses their business is clearly a backward step, even if it brings considerable cost savings.

Accordingly, a gating process has to accompany the financial case, looking at the qualitative issues and any behavioural aspects of automation that run counter to it. The quality, ease of use and overall acceptability of automated services to the intended client base is a prerequisite, and it may only be achieved by adapting a process over time and obtaining real-life feedback and analytics. These issues are the subject of wider consideration and should be addressed within the context of developing a business case.

Points to consider

The key metrics to build a business case have been listed in this paper and, as indicated in the worked example, in instances where a DTMF application is already in use some substantial savings can be made with a speech application.

It is worth considering at this stage the method used to develop the applications in the first place, as using a flexible, feature-rich open standards service creation system can make additional savings in the short and long term. Vicorp’s xMP SCEE (service creation and execution environment) offers additional benefits such as white-box component models which improve time to market, lower cost of development and substantially lower ongoing application maintenance costs. For further information on Vicorp’s service creation system, please refer to

Addendum – The workings

TABLE 1: Equipment

Reduction in call hold time is 14.7%, so for illustration assume a capacity reduction of 12% rather than the full 14.7%. This allows for the Erlang formula, which requires a slight increase in capacity to handle ‘blocked’ and ‘retried’ calls.

Therefore, starting from the assumed busy hour requirement of 190 ports, the new port requirement is 167 to support this application (190 x 0.88 = 167.2).

Number of ports saved = 23
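The 12% figure is an illustrative allowance rather than a calculated one. For readers who want to see the effect behind it, the sketch below uses the standard Erlang B recursion to find how many ports are needed for a given offered load and blocking target. It is plain Erlang B, which treats blocked calls as lost and does not model retries, and the function names are ours, not the paper's.

```python
def erlang_b(traffic_erlangs, ports):
    """Erlang B blocking probability for an offered load on `ports` ports.

    Uses the standard recursion B(0) = 1,
    B(n) = A * B(n-1) / (n + A * B(n-1)).
    """
    blocking = 1.0
    for n in range(1, ports + 1):
        blocking = traffic_erlangs * blocking / (n + traffic_erlangs * blocking)
    return blocking


def ports_required(traffic_erlangs, target_blocking):
    """Smallest number of ports whose blocking is at or below the target."""
    if not 0 < target_blocking < 1:
        raise ValueError("target_blocking must be between 0 and 1")
    ports = 0
    while erlang_b(traffic_erlangs, ports) > target_blocking:
        ports += 1
    return ports
```

As a usage example, comparing `ports_required(170, 0.01)` with `ports_required(170 * (1 - 0.147), 0.01)` shows how the port requirement responds to a 14.7% cut in traffic. Both the 170 erlang load and the 1% blocking target are hypothetical inputs chosen for illustration; the paper does not state either.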

This releases an effective £18.4k of port CAPEX, based on a full IVR hardware and software purchase price of £800 per port (23 x £800). The released ports can be put to other use, e.g. in the case of expansion, so that new ports do not have to be purchased.

CAPEX saved = £18.4k

This also releases an effective £16.5k of annual OPEX, based on a port operational cost of £60 per month (power, cooling, support etc.).

OPEX saved = £16.5k
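The Table 1 workings can be reproduced with a short Python sketch. The function name and the choice to round the new port count down are assumptions made here to match the paper's figure of 167; all input values come from the text above.

```python
import math


def port_savings(current_ports, capacity_reduction, capex_per_port, opex_per_port_month):
    """Return (new_ports, ports_saved, capex_released, annual_opex_released)."""
    # Rounded down to match the paper's 167 (190 x 0.88 = 167.2).
    new_ports = math.floor(current_ports * (1 - capacity_reduction))
    ports_saved = current_ports - new_ports
    capex_released = ports_saved * capex_per_port
    annual_opex_released = ports_saved * opex_per_port_month * 12
    return new_ports, ports_saved, capex_released, annual_opex_released


new_ports, ports_saved, capex, opex = port_savings(
    current_ports=190,        # assumed busy hour requirement
    capacity_reduction=0.12,  # illustrative figure, below the 14.7% hold time cut
    capex_per_port=800,       # GBP, hardware and software
    opex_per_port_month=60,   # GBP, power, cooling, support etc.
)
print(new_ports, ports_saved, capex, opex)
```

The unrounded OPEX figure is £16,560 per year, which the paper quotes as £16.5k.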


TABLE 2: Call Hold Time

We attribute the reduction in call hold time to the manner in which the caller can more easily input data into the payment application compared to DTMF (Dual-tone multi-frequency).

Firstly, there is a basic delay using DTMF on a mobile phone due to the need to move the phone from within vision/touch to the ear and back.

Secondly, items such as expiry dates and amounts are more easily input by speech and fewer mistakes are made. For example, it is easier to say "May two thousand and ten" than to type "0510", given that callers will make inputs such as "510", "052010", "05#10#" etc. Ideally the existing DTMF application would cater for these inputs, but even if it does, there is still a delay in confirming what the caller meant.

Thirdly, there are fewer menu choices. The caller will not have to listen to a menu that says "For Visa press 1, Mastercard press 2, Amex press 3...." etc. but will simply be able to say "Mastercard".

Reduction in call length = 15 seconds (14.7%)

To calculate the saving we need to estimate the overall time saved (annually) by this level of reduction and then understand the telecoms/platform costs.

Based on the assumed busy hour requirement of 190 ports, we can calculate that the platform is handling 684,000 seconds of this application during the busiest hour (190 ports x 3,600 seconds). We assume that this is 20% of the day's traffic and thus the platform handles 3.42m seconds of this traffic per day. We multiply this by 25 to achieve an average monthly figure (85.5m seconds) and then by 12 for an annual figure (1,026 million seconds = 17.1 million minutes of traffic).

We can therefore calculate that our call length reduction of 14.7% reduces this traffic to 14.6 million minutes, a reduction of 2.51 million minutes.

Reduction in call minutes = 2.51 million minutes

If we assume calls are free to the caller for this application (effectively an 0800 call), then there is a cost to you for each call made and each minute transported. We assume this to be a wholesale rate of 2.5p per minute.

The cost saving from the 2.51 million minute reduction is therefore 2.51m x 2.5p = £62.7k
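The Table 2 arithmetic can be checked with the following Python sketch. The function names are illustrative; the inputs (190 ports, busy hour at 20% of daily traffic, 25 days per month, 14.7% reduction, 2.5p per minute) are the paper's assumptions.

```python
def annual_minutes(busy_hour_ports, busy_hour_share, days_per_month):
    """Annual traffic in minutes, scaled up from a fully used busy hour."""
    busy_hour_seconds = busy_hour_ports * 3600
    daily_seconds = busy_hour_seconds / busy_hour_share
    annual_seconds = daily_seconds * days_per_month * 12
    return annual_seconds / 60


def telecom_saving(minutes, reduction, pence_per_minute):
    """Return (minutes_saved, saving_in_pounds) for a fractional reduction."""
    minutes_saved = minutes * reduction
    return minutes_saved, minutes_saved * pence_per_minute / 100


total = annual_minutes(busy_hour_ports=190, busy_hour_share=0.20, days_per_month=25)
saved, pounds = telecom_saving(total, reduction=0.147, pence_per_minute=2.5)
print(f"{total:,.0f} minutes, {saved:,.0f} saved, GBP {pounds:,.0f}")
```

Carrying the unrounded figures through gives about 2.514 million minutes and roughly £62.8k; the paper rounds the minutes to 2.51 million first, which is why it quotes £62.7k.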


For simplicity we assume that additional telecom costs (internal transport, networks and configuration, CTI and data links) are zero.

TABLE 3: Application fallout / Agent costs

We attribute the savings in application fallout rates and agent time to the ability of the speech application to retain the caller more successfully and complete more transactions. This claim is made on the basis of:

  • Well-documented evidence that shows callers prefer speech applications and co-operate with them more readily
  • The ease of providing accurate input reduces errors and the frustration that causes further errors, and thus reduces the error correction cycles required and the propensity for the caller to ‘bail out’.
  • The slicker and faster experience encourages callers to complete their task.

We have previously calculated that the IVR is handling 17.1 million minutes of payment traffic per annum. This represents 83% of total traffic, with the remaining 17% being handled or corrected by agents. Agent traffic for payments is thus 3.5 million minutes (17.1m / 0.83 x 0.17). Here we have assumed that the time taken by the agent to process the enquiry is the same as the IVR. It could be longer or shorter on average depending on the nature of the 'fall out' from the payment application (e.g. invalid card, incorrect input); correcting an invalid card could result in a very lengthy call.

We then assume that the speech application reduces the fallout rate from 17% to 13% and thus reduces the 3.5m minutes of agent time to 2.7m minutes, a saving of 0.82 million agent minutes. At the original call length of about 102 seconds implied by Table 2 (15 seconds being 14.7%), this represents approximately 485k payment transactions per annum.

We assume that the cost for an agent to handle and process a payment transaction is £0.70. The overall total annual saving is thus 485k x £0.70 = £339.5k
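The Table 3 workings can be sketched in Python as follows. One input is inferred rather than stated: the original call length of roughly 102 seconds is derived here from the Table 2 statement that 15 seconds is 14.7% of a call, and it appears to be what links 0.82 million minutes to about 485k transactions. The function name is illustrative.

```python
def agent_saving(ivr_minutes, fallout_before, fallout_after,
                 call_minutes, cost_per_transaction):
    """Return (agent_minutes_saved, transactions_saved, saving_in_pounds)."""
    # IVR minutes are the non-fallout share of total traffic.
    total_minutes = ivr_minutes / (1 - fallout_before)
    agent_before = total_minutes * fallout_before
    agent_after = total_minutes * fallout_after
    minutes_saved = agent_before - agent_after
    transactions_saved = minutes_saved / call_minutes
    return minutes_saved, transactions_saved, transactions_saved * cost_per_transaction


# ASSUMPTION: call length inferred from "15 seconds (14.7%)" in Table 2.
CALL_MINUTES = 15 / 0.147 / 60  # about 1.7 minutes

minutes_saved, transactions, pounds = agent_saving(
    ivr_minutes=17_100_000,
    fallout_before=0.17,
    fallout_after=0.13,
    call_minutes=CALL_MINUTES,
    cost_per_transaction=0.70,  # GBP per agent-handled payment
)
print(f"{minutes_saved:,.0f} minutes, {transactions:,.0f} transactions, GBP {pounds:,.0f}")
```

Without intermediate rounding the result is about 824k minutes, 484.6k transactions and £339.2k, close to the paper's rounded £339.5k.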

