Scoring the score sheets
I have judged in barista competitions, cupped with roasters worldwide and embarked on Q-grader certification, and now it is the turn of the Cup of Excellence competition in El Salvador. What they have in common is the formal scoring and evaluation of coffee, but how does this work?
Purpose versus Pleasure
If coffee is being scored formally, it is usually in the context of tangible business outcomes that the taster hopes to achieve. The usefulness of any system of coffee tasting and evaluation has to be determined by reference to its goals. Some commentators have branded “cupping” a useless exercise because the coffee tasted from a cupping bowl does not taste the same as espresso. As well as ignoring the many talented roasters with the mental agility to translate results from a cupping bowl to espresso, this statement also ignores the many instances in which the goal of cupping is to check for consistency, eliminate defects or select coffee for brewing methods other than espresso.
In producing countries, the “cupping” method is often used for its simplicity and the ability to taste many coffees at a time. Ground coffee is steeped for four minutes or so, the “crust” that floats to the top is broken with a spoon to release a burst of aroma, any coffee still floating on the surface is skimmed off and the liquor is slurped from a spoon as it cools. The roast level will be selected according to the purpose of the cupping. If the goal is simply eliminating defective coffee, a very light roast level might be suitable - flavour will not be fully developed, but defects such as mouldiness will stand out very clearly against the clean cup profile produced. If the goal is to match coffee with buyers or to decide whether to blend lots, the roast might be taken slightly darker to develop a little more flavour, though the standardised roast level used is much lighter than we are used to for espresso. These processes are crucial to maximise returns from coffee sales.
In consuming countries, roasters and importers might cup coffee to select lots to buy. Usually the roast level will be selected for flavour development. Doing it well is an exercise in predicting the future: coffee buyers need to avoid buying a lot that tastes great now but will fade in a few months, and the telltale signs might be nothing more than a little bit of astringency in the aftertaste. Shrewd roaster buyers also need to consider whether a given coffee will play nicely with others in a blend and how it will turn out when roasted for espresso. Once the coffee is purchased and blends are rolled out, good coffee roasters will taste each batch for consistency and quality. Multiple batches of the same coffee might have slight variations and tasting can reveal if it is necessary to discard them or blend them together. Once the coffee leaves the roaster, it might be scored by reviewers, or scored to give feedback to baristas.
Evaluating versus Describing
In some instances, it may be enough to simply evaluate a coffee as good or bad. An example occurs when cupping to screen out defects or to verify that production roasts have resulted in a cup profile that fits the profile envisaged for a retail product. Most of the time, though, the questions to be answered when tasting coffee are both “how good is it?” and “what does it taste like?” Looking at a score alone can be dangerous, particularly for consumers. For instance, suppose that a particular high-scoring coffee has high body, low acidity, a fair bit of sweetness and some nutty characteristics. A coffee roaster buying this coffee on points alone might well end up with coffee that is similar to what is already on hand. A consumer buying on points alone might be disappointed if they were hoping for a brighter and fruitier coffee.
Most scoring systems ask reviewers to assign a numerical value to attributes such as acidity, body, clean cup character, sweetness, flavour and aftertaste. These values are then added together to give a total score. These scoring systems work well where enough thought has been put into the relative weightings of each attribute so that the total score gives a fair indication of suitability for the purpose for which the coffee is being scored on that form. A number of forms exist and some roasteries have made their own internal cupping forms with weightings reflective of the sorts of coffees that they would like to offer; for example, in one form sweetness is given twice the weighting of any other attribute. Most forms also contain a field such as “overall” or “cuppers’ correction” that tasters can use to adjust the total score if some particular attribute of the coffee is particularly delicious or offensive.
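To make the arithmetic concrete, here is a minimal sketch in Python of a weighted form with a cuppers' correction. The attribute names come from the paragraph above, but the weights, the score scale and the function name `total_score` are hypothetical and do not reproduce any official score sheet.

```python
# Hypothetical weights: sweetness counts double, every other attribute counts once.
WEIGHTS = {
    "acidity": 1.0,
    "body": 1.0,
    "clean_cup": 1.0,
    "sweetness": 2.0,
    "flavour": 1.0,
    "aftertaste": 1.0,
}


def total_score(attribute_scores, correction=0.0, weights=WEIGHTS):
    """Add up the weighted attribute scores, then apply the taster's correction."""
    missing = set(weights) - set(attribute_scores)
    if missing:
        raise ValueError(f"no score given for: {sorted(missing)}")
    weighted_sum = sum(weights[name] * attribute_scores[name] for name in weights)
    return weighted_sum + correction


# A coffee scored 6 on every attribute, nudged up by 1.5 for being unusually delicious.
cup = {name: 6.0 for name in WEIGHTS}
print(total_score(cup))                  # weighted sum only
print(total_score(cup, correction=1.5))  # weighted sum plus cuppers' correction
```

Changing a single entry in `WEIGHTS` is all it takes to shift what the form rewards, which is why the weightings deserve so much thought.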
Professionals who take the time to learn how to use the many forms available find that they can get an excellent picture of how a coffee will taste based on how it is scored. Conversely, some contend that trying to measure individual attributes distorts the overall score. Clearly, those people are using score sheets that are not suited to their purposes, do not understand the score sheets that they are using or simply find it more convenient to have people take their word without having to provide a measure of transparency or accountability.
Intensity versus Preference
There are different schools of thought on whether the intensity of an attribute or the level of preference for an attribute should be scored. Take acidity for example. Higher acidity is not necessarily better - sour coffees can be unpleasant. Any system that scores based purely on intensity must have some method to ensure that high intensity attributes that are not pleasant can be corrected in the overall score, such as by using cuppers’ corrections or by having an overall score that does not depend fully on the sum of the scores for each attribute. Scoring for preference solves this problem, but creates another problem: the intensity of the attribute is hidden. Sure, the coffee had great acidity, but how intense was it? One approach to this problem is to record both intensity and preference, but to score based on preference.
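The last approach, recording both but scoring only on preference, might look something like the sketch below. The class `AttributeRating`, the helper names and both scales are assumptions made for illustration, not part of any published form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeRating:
    name: str
    intensity: int     # how strong the attribute is (hypothetical 1-5 scale); recorded only
    preference: float  # how much the taster liked it; this is what gets scored


def preference_total(ratings):
    """Total score built from preference alone; intensity never enters the sum."""
    return sum(rating.preference for rating in ratings)


def describe(rating):
    """Keep the intensity visible next to the preference score."""
    return f"{rating.name}: preference {rating.preference}, intensity {rating.intensity}"


ratings = [
    AttributeRating("acidity", intensity=5, preference=4.0),  # very intense, not much liked
    AttributeRating("body", intensity=2, preference=7.0),     # light, but well liked
]
print(preference_total(ratings))
for rating in ratings:
    print(describe(rating))
```

The total answers "how good is it?" while the recorded intensities still answer "how intense was it?".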
Numbers versus Words
Score sheets are instantly more approachable when you start to think about the numbers in terms of the words that describe them. On a Cup of Excellence score sheet, 6 out of 8 is of a standard that could be a competition winner. On a World Barista Championship score sheet, 3 is good and 4 is very good, with half points intermediate between the two. Sometimes scores are expressed to a few decimal places and you might well wonder if anyone can ever score to a level of precision where they can be confident in saying that a coffee deserves an 84.32. In these cases, the decimal points are usually a result of averaging multiple scores. In some contexts, using words throughout can be a simpler way of communicating information than using numbers that need to be interpreted by reference to words anyway.
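A small sketch shows how those decimals can arise. The judges' totals below are invented for illustration and `panel_score` is a hypothetical helper; no single judge scores more finely than a quarter point, yet the averaged result carries two decimal places.

```python
from statistics import mean


def panel_score(judge_totals):
    """Average the judges' totals and round to two decimal places."""
    return round(mean(judge_totals), 2)


# Four invented totals, each given only to the nearest quarter point.
totals = [84.25, 84.5, 84.0, 84.5]
print(panel_score(totals))
```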
Taste versus Aroma
It is always interesting to look at the different weightings given to tastes (e.g. sweet, salty, sour, bitter) and aromas (e.g. chocolatey, fruity, floral). Often, descriptions of retail coffee focus on aroma, almost to the exclusion of taste, whereas most green coffee evaluation score sheets dedicate only a small number of points to aroma and focus largely on sensations like taste and body. Descriptions of both taste and aroma are necessary to get the full picture of a coffee.
For people whose professional lives revolve around the product, scoring and evaluating coffee consistently can be a big deal, but fortunately they never let it take the fun out of it. I have been lucky enough to share some delicious cups of coffee with coffee professionals around the world and though the cup quality is often due to rigorous scoring and evaluation, it always takes a back seat to great conversation!
