apca-introduction

The missing introduction to APCA  https://p.ce9e.org/apca-introduction/
git clone https://git.ce9e.org/apca-introduction.git

commit 23d736da71aa8f0ffe7a6af75b57e09276404da3
parent 8f4e00f99fb14f64ced461c86aeb6fb84c2206ca
Author Tobias Bengfort <tobias.bengfort@posteo.de>
Date 2025-03-21 13:28

rewrite analysis

- leave out some bits about which I am not 100% sure
- leave out some bits that are not that relevant
- restructure

Diffstat

M analysis.md 577 +++++++++++++++++++++----------------------------------------

1 files changed, 198 insertions, 379 deletions


diff --git a/analysis.md b/analysis.md

@@ -1,97 +1,17 @@
    1    -1 # Detailed analysis of APCA (2022-08-05)
   -1     1 # Detailed analysis of APCA
    2     2 
    3    -1 I am a regular web developer with a bachelor's degree in math, but without any
    4    -1 training in the science around visual perception. That's why I cannot evaluate
    5    -1 whether APCA is *better* than WCAG 2.x. Instead this is a systematic
    6    -1 comparison of their mathematical properties.
   -1     3 APCA was developed to address some [issues] related to contrast in the
   -1     4 [Web Content Accessibility Guidelines] (WCAG). WCAG is an official W3C
   -1     5 recommendation, a normative part of many laws all over the world, and
   -1     6 generally a good read.
    7     7 
    8    -1 ## Context: The Web Content Accessibility Guidelines (WCAG)
    9    -1 
   10    -1 APCA was developed to address some issues related to contrast in the [Web
   11    -1 Content Accessibility Guidelines] (WCAG). WCAG is an official W3C
   12    -1 recommendation, a normative part of many laws all over the world, and generally
   13    -1 a good read.
   14    -1 
   15    -1 WCAG faces a difficult challenge though: There is no one-size-fits-all solution
   16    -1 for accessibility. Different humans have different needs, and different
   17    -1 situations require different kinds of support.
   18    -1 
   19    -1 This is also the case in the context of color contrast: vision impairments,
   20    -1 ambient light, and screen settings can all have a pronounced impact on
   21    -1 legibility. None of these are known beforehand by website authors, so the rules
   22    -1 provided by WCAG need to work regardless of these factors.
   23    -1 
   24    -1 Faced with the question whether it wanted to give precise instructions (that
   25    -1 might not be ideal in every situation) or give nuanced but ultimately vague
   26    -1 advice, WCAG went with the former. So today WCAG provides a list of detailed
   27    -1 steps for evaluating a website. Many of these checks can be automated. It does
   28    -1 not always result in perfect accessibility, but it gives lawmakers a solid
   29    -1 baseline.
   30    -1 
   31    -1 ## Components of contrast
   32    -1 
   33    -1 When we speak about contrast, we actually mean a few different things:
   34    -1 
   35    -1 -   How is the contrast between two colors calculated?
   36    -1 -   Which thresholds are used to decide whether that contrast is sufficient?
   37    -1 -   How do other features like font size and font weight factor into that
   38    -1     decision?
   39    -1 -   Which parts of the UI need to be checked?
   40    -1 
   41    -1 In the following sections I will take a closer look at how WCAG 2.x and APCA
   42    -1 answer each of these questions.
   -1     8 I am a regular web developer with a bachelor's degree in math, but
   -1     9 without any training in the science around visual perception. That's why
   -1    10 I cannot evaluate whether APCA is *better* than WCAG 2.x. Instead this
   -1    11 is a systematic comparison of their mathematical properties.
   43    12 
   44    13 ## The contrast formula
   45    14 
   46    -1 > all models are wrong, but some are useful\
   47    -1 > -- George Box
   48    -1 
   49    -1 There is no *true* contrast formula. Instead, these formulas are supposed to
   50    -1 predict how most humans perceive a color combination, even if they cannot be
   51    -1 correct 100% of the time.
   52    -1 
   53    -1 ### A naive approach
   54    -1 
   55    -1 ```js
   56    -1 function sRGBtoJ(srgb) {
   57    -1   return (srgb[0] + srgb[1] + srgb[2]) / 3;
   58    -1 }
   59    -1 
   60    -1 function contrast(fg, bg) {
   61    -1   var jfg = sRGBtoJ(fg);
   62    -1   var jbg = sRGBtoJ(bg);
   63    -1 
   64    -1   return jbg - jfg;
   65    -1 };
   66    -1 ```
   67    -1 
   68    -1 This naive approach provides a baseline for the other formulas we will look at.
   69    -1 It does not consider anything we know about human vision, but it already
   70    -1 features the basic structure: We first transform each color to a value that
   71    -1 represents lightness. Then we calculate a difference between the two lightness
   72    -1 values.
   73    -1 
   74    -1 ### Historical context
   75    -1 
   76    -1 Lightness (J) is a measure for the perceived amount of light. Luminance (Y) is
   77    -1 a measure for the physical amount of light. In order to understand perceived
   78    -1 contrast, we first have to understand the relationship between luminance and
   79    -1 lightness.
   80    -1 
   81    -1 In the nineteenth century, E. H. Weber found that human perception works in
   82    -1 relative terms, i.e. the difference between 100 g and 110 g is perceived the
   83    -1 same as the difference between 1000 g and 1100 g. Applied to vision this means
   84    -1 that a contrast between two color pairs is perceived the same if `(Y1 - Y2) /
   85    -1 Y2` has the same value. This is known as Weber contrast and has been called the
   86    -1 ["gold standard" for text contrast].
   87    -1 
   88    -1 Fechner concluded that the relation between a physical measure `Y` and a
   89    -1 perceived measure `J` can be expressed as `J = a * log(Y) + b`. This is called
   90    -1 the Weber-Fechner law.
   91    -1 
   92    -1 In 1961 Stevens published a different model that was found to more accurately
   93    -1 predict human vision. It has the form `J = a * pow(Y, alpha) + b`.[^1]
   94    -1 
   95    15 ### WCAG 2.x
   96    16 
   97    17 ```js
@@ -120,29 +40,42 @@ function contrast(fg, bg) {
  120    40 };
  121    41 ```
  122    42 
  123    -1 In WCAG 2.x we see the same general structure as in the naive approach, but the
  124    -1 individual steps are more complicated:
   -1    43 Colors on the web are defined in the [sRGB color space]. The first part
   -1    44 of this formula is the official formula to convert an sRGB color to
   -1    45 luminance (Y). Doubling sRGB values (e.g. from `#444` to `#888`) does
   -1    46 not actually double the physical amount of light, so the first step is a
   -1    47 non-linear "gamma decoding". Then the red, green, and blue channels are
   -1    48 weighted to sum to the final luminance.
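
The effect of gamma decoding can be checked numerically. The following is a minimal sketch that assumes the standard sRGB transfer function and channel weights; `channelToLinear` is a hypothetical helper name, and the code is not copied from the hunk above.

```js
// Sketch: standard sRGB gamma decoding for one 8-bit channel.
// Older WCAG texts list 0.03928 instead of 0.04045 as the cutoff;
// for 8-bit values both give the same result.
function channelToLinear(c) {
  c = c / 255;
  if (c <= 0.04045) {
    return c / 12.92;
  }
  return Math.pow((c + 0.055) / 1.055, 2.4);
}

// Weighted sum of the linear channels gives the luminance (Y).
function sRGBtoY(srgb) {
  return (
    0.2126 * channelToLinear(srgb[0]) +
    0.7152 * channelToLinear(srgb[1]) +
    0.0722 * channelToLinear(srgb[2])
  );
}

var y444 = sRGBtoY([0x44, 0x44, 0x44]);
var y888 = sRGBtoY([0x88, 0x88, 0x88]);

// Doubling the sRGB values roughly quadruples the luminance here.
var ratio = y888 / y444;
console.log(ratio);
```

Under these assumptions the ratio comes out well above 2, which is the point of the paragraph above: sRGB values are not proportional to the physical amount of light.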
   -1    49 
   -1    50 Next, 0.05 is added to both values. I am not exactly sure how to
   -1    51 interpret this parameter. It could model ambient light that is reflected
   -1    52 on the screen (flare).[^1] Or it might model the fact that black on a
   -1    53 screen is not total black. Or it might just be a numerical trick to
   -1    54 avoid dividing by zero. We will discuss the impact of this parameter
   -1    55 later.
   -1    56 
   -1    57 Then we look at the ratio of these two values. I hope I can convince you
   -1    58 that these two statements are equivalent:
  125    59 
  126    -1 Colors on the web are defined in the [sRGB color space]. The first part of this
  127    -1 formula is the official formula to convert a sRGB color to luminance. Doubling
  128    -1 sRGB values (e.g. from `#444` to `#888`) does not actually double the physical
  129    -1 amount of light, so the first step is a non-linear "gamma decoding". Then the
  130    -1 red, green, and blue channels are weighted to sum to the final luminance.
   -1    60     ybg / yfg > 4.5
   -1    61     log(ybg) - log(yfg) > log(4.5)
  131    62 
  132    -1 Next, 0.05 is added to both values to account for ambient light that is
  133    -1 reflected on the screen (flare). Since we are in the domain of physical light,
  134    -1 we can just add these values. 0.05 means that we assume that the flare amounts
  135    -1 to 5% of the white of the screen.[^2]
   -1    63 The first one is sometimes called "simple contrast", and this is what is
   -1    64 used in WCAG 2.x. The second one uses the [Weber-Fechner model] to
   -1    65 convert luminance to perceived lightness (J). Technically, the
   -1    66 Weber-Fechner law is `J = a * log(y) + b`, but `b` vanishes in the
   -1    67 difference and `a` would just scale the whole equation.
  136    68 
  137    -1 Then the Weber contrast is calculated. Note that `(Y1 - Y2) / Y2` is the same
  138    -1 as `Y1 / Y2 - 1`. The shift by 1 is removed because it has no impact on the
  139    -1 results (as long as the thresholds are adapted accordingly).
   -1    69 Given that the two are equivalent, you can think of this contrast ratio
   -1    70 as a lightness difference. We just convert the values to avoid having to
   -1    71 calculate logarithms.
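
The equivalence of the two statements can also be checked with a few sample values. This is only a sketch: `passesRatio` and `passesLogDifference` are hypothetical helper names, and the luminance pairs are made-up examples that already include the 0.05 offset.

```js
// Sketch: "simple contrast" as a ratio of two luminance values.
function passesRatio(ybg, yfg, threshold) {
  return ybg / yfg > threshold;
}

// The same check expressed as a difference of logarithms.
function passesLogDifference(ybg, yfg, threshold) {
  return Math.log(ybg) - Math.log(yfg) > Math.log(threshold);
}

// Made-up [ybg, yfg] pairs, chosen away from the exact threshold
// to avoid floating point edge cases.
var pairs = [
  [1.05, 0.05],
  [0.5, 0.2],
  [0.3, 0.06],
  [0.25, 0.05],
];

var allAgree = pairs.every(function (pair) {
  return (
    passesRatio(pair[0], pair[1], 4.5) ===
    passesLogDifference(pair[0], pair[1], 4.5)
  );
});
console.log(allAgree);
```

Both functions accept and reject the same pairs because `log` is strictly increasing, so applying it to both sides of the inequality does not change the result.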
  140    72 
  141    -1 Finally, the polarity is removed so that the formula has the same results when
  142    -1 the two colors are switched.
   -1    73 Finally, the polarity is removed so that the formula has the same
   -1    74 results when the two colors are switched.
  143    75 
  144    -1 All in all this is a pretty solid contrast formula (at least from a theoretical
  145    -1 perspective), as it just reuses parts from well established standards.
   -1    76 All in all this is a pretty solid contrast formula (at least from a
   -1    77 theoretical perspective), as it just reuses parts from well established
   -1    78 standards. The only odd bit is the `0.05` offset.
  146    79 
  147    80 ### APCA
  148    81 
@@ -162,14 +95,15 @@ function sRGBtoY(srgb) {
  162    95 function contrast(fg, bg) {
  163    96   var yfg = sRGBtoY(fg);
  164    97   var ybg = sRGBtoY(bg);
  165    -1   var c = 1.14;
  166    98 
  167    99   if (ybg > yfg) {
  168    -1     c *= Math.pow(ybg, 0.56) - Math.pow(yfg, 0.57);
   -1   100     var c = Math.pow(ybg, 0.56) - Math.pow(yfg, 0.57);
  169   101   } else {
  170    -1     c *= Math.pow(ybg, 0.65) - Math.pow(yfg, 0.62);
   -1   102     var c = Math.pow(ybg, 0.65) - Math.pow(yfg, 0.62);
  171   103   }
  172   104 
   -1   105   c *= 1.14;
   -1   106 
  173   107   if (Math.abs(c) < 0.1) {
  174   108     return 0;
  175   109   } else if (c > 0) {
@@ -182,29 +116,31 @@ function contrast(fg, bg) {
  182   116 };
  183   117 ```
  184   118 
  185    -1 The conversion from sRGB to luminance uses similar coefficients, but the
  186    -1 non-linear part is very different. The author of APCA provides some motivation
  187    -1 for these changes in the article [Regarding APCA Exponents]. The main argument
  188    -1 seems to be that this is supposed to more closely model real-world computer
  189    -1 screens. This step also incorporates a flare of `Math.pow(0.022, 1.414) ~=
  190    -1 0.0045`.
  191    -1 
  192    -1 Next, the contrast is calculated based on the Stevens model. Interestingly,
  193    -1 APCA uses four different exponents for light foreground (0.62), dark foreground
  194    -1 (0.57), light background (0.56), and dark background (0.65).
  195    -1 
  196    -1 The final steps do some scaling and shifting that only serves to get nice
  197    -1 threshold values. Just like the shift by 1 in the WCAG formula, this does not
  198    -1 affect results as long as the thresholds are adapted accordingly. Note that the
  199    -1 `< 0.1` condition only affects contrasts that are below the lowest threshold
  200    -1 anyway.
  201    -1 
  202    -1 This formula is based on the more modern Stevens model, but also has some
  203    -1 unusual parts. The non-standard `sRGBtoY` is hard to evaluate without further
  204    -1 information on how it was derived. All of the exponents are significantly
  205    -1 higher than the common 1/3. Analysis is also complicated by the fact that the
  206    -1 three levels of exponents (gamma, alpha, different exponents for light/dark
  207    -1 foreground/background) are not clearly separated.
   -1   119 We again see a similar structure here: Convert sRGB colors to luminance;
   -1   120 convert luminance to perceived lightness; then calculate the difference.
   -1   121 But there are also some interesting differences.
   -1   122 
   -1   123 The conversion from sRGB to luminance deviates from the sRGB standard.
   -1   124 Especially the non-linear part is very different. The author of APCA
   -1   125 provides some motivation for these changes in the article [Regarding
   -1   126 APCA Exponents] and in a [github issue]. The main argument seems to be
   -1   127 that this is supposed to more closely model real-world computer screens.
   -1   128 This step also makes sure that the smallest possible value is
   -1   129 `Math.pow(0.022, 1.414) ~= 0.0045`.
   -1   130 
   -1   131 The conversion to lightness uses the more modern [Stevens model]
   -1   132 `J = a * pow(Y, alpha) + b` instead of Weber-Fechner.[^2] Interestingly,
   -1   133 APCA uses four different exponents for light foreground (0.62), dark
   -1   134 foreground (0.57), light background (0.56), and dark background (0.65).
   -1   135 Unfortunately, the three levels of exponents (gamma, alpha, different
   -1   136 exponents for light/dark foreground/background) are not clearly
   -1   137 separated, which makes analysis more complicated.
   -1   138 
   -1   139 The final steps do some scaling and shifting that (in my understanding)
   -1   140 only serves to get nice threshold values. Just like the `log()`
   -1   141 conversion in the WCAG formula, this does not affect results as long as
   -1   142 the thresholds are adapted accordingly. Note that the `< 0.1` condition
   -1   143 only affects contrasts that are below any relevant thresholds anyway.
  208   144 
  209   145 ### Normalization
  210   146 
@@ -247,10 +183,6 @@ function contrast(fg, bg) {
  247   183 
  248   184   return jbg - jfg;
  249   185 };
  250    -1 
  251    -1 function normalize(c) {
  252    -1   return Math.log(c) / Math.log(21);
  253    -1 }
  254   186 ```
  255   187 
  256   188 APCA becomes:
@@ -285,292 +217,179 @@ function contrast(fg, bg) {
  285   217     return Math.pow(jbg, 0.65 / 0.6) - Math.pow(jfg, 0.62 / 0.6);
  286   218   }
  287   219 };
  288    -1 
  289    -1 function normalize(c) {
  290    -1   return (c / 100 + 0.027) / 1.14;
  291    -1 }
  292   220 ```
  293   221 
  294   222 ### Comparison
  295   223 
  296    -1 Now that we have aligned the two formulas, what are the actual differences?
  297    -1 
  298    -1 ![contrast comparison](plots/contrast_comparison.png)
  299    -1 
  300    -1 These are scatter plots based on a random sample of color pairs. The x-axis
  301    -1 corresponds to background luminance, the y-axis corresponds to foreground
  302    -1 luminance (both using the APCA formula). The color of the dots indicates the
  303    -1 differences between the respective formulas.
  304    -1 
  305    -1 The plot on the bottom right compares APCA to WCAG 2.x. As we can see, the
  306    -1 biggest differences are in areas where one color is extremely light or
  307    -1 extremely dark. For light colors, APCA predicts an even higher contrast
  308    -1 (difference is in the same direction as contrast polarity). For dark colors,
  309    -1 APCA predicts a lower contrast (difference is inverse to contrast polarity).
  310    -1 The difference goes up to 20%.
  311    -1 
  312    -1 The other three plots compare APCA to a modified version of APCA where one of
  313    -1 the steps has been replaced by the corresponding step from WCAG 2.x. This way
  314    -1 we can see that `sRGBtoY` contributes 4% to the difference, `YtoJ` contributes
  315    -1 15%, and `contrast` contributes 3%.
  316    -1 
  317    -1 Since the conversion from luminance to lightness causes the biggest difference,
  318    -1 I took a closer look at it.
  319    -1 
  320    -1 ![lightness comparison](plots/lightness_comparison.png)
  321    -1 
  322    -1 I plotted curves for both the Weber-Fechner model (log) and the Stevens model
  323    -1 (pow) with different parameters.
  324    -1 
  325    -1 -   The log curve with a flare of 0.05 (WCAG 2) is closer to the pow curve with
  326    -1     an exponent of 1/3 and a flare of 0.0025
  327    -1 -   The log curve with a flare of 0.4 is closer to the pow curves with
  328    -1     exponents 0.56 and 0.68 and a flare of 0.0045 (similar to APCA)
  329    -1 
  330    -1 This shows that a big part of the different results between WCAG 2.x and APCA
  331    -1 are caused by a different choice in parameters. If we were to change the flare
  332    -1 value in WCAG 2.x to 0.4 we would get results much closer to APCA. And if we
  333    -1 were to change the exponents in APCA to 1/3 we would get results much closer to
  334    -1 WCAG 2.x.
  335    -1 
  336    -1 Our vision adapts to the lighting conditions. In the dark, we are much better
  337    -1 at discerning dark colors. A lower exponent models these darker conditions.
  338    -1 The most complete (but also most complex) color appearance model currently
  339    -1 available is [CIECAM02](https://en.wikipedia.org/wiki/CIECAM02). It uses
  340    -1 exponents between 0.31 and 0.72. Given that model, WCAG 2.x is on the lower
  341    -1 (darker) end of possible exponents, while APCA goes to the other (lighter)
  342    -1 extreme. This is consistent with the observation that APCA reports lower
  343    -1 contrast for darker colors.
  344    -1 
  345    -1 It seems reasonable to use a lower exponent for light-on-dark color pairs.
  346    -1 First, because the background color itself is often a significant part of the
  347    -1 lighting conditions, and second, because some software automatically switches
  348    -1 to dark mode at night. Surprisingly, APCA does the exact opposite and uses
  349    -1 higher exponents in that situation.
  350    -1 
  351    -1 ## Spatial frequency
  352    -1 
  353    -1 Smaller text is generally harder to read than bigger text. In a more general
  354    -1 sense, we can speak about the spatial frequency of features. This is usually
  355    -1 measured in cycles per degree (cpd), since the visual field is measured as an
  356    -1 angle.
  357    -1 
  358    -1 If content is easy to read because of its spatial frequency, I do not need as
  359    -1 much color contrast. On the other hand, if the spatial frequency is bad, more
  360    -1 color contrast is needed.
  361    -1 
  362    -1 There is one caveat though: The spatial frequency only defines the contrast
  363    -1 threshold under which a pattern is not perceivable at all. Above that it has
  364    -1 barely any effect. So the best way to use it is to define a minimum required
  365    -1 color contrast based on spatial frequency.[^3]
  366    -1 
  367    -1 Interestingly, a lower spatial frequency is not always easier to read though.
  368    -1 [Studies have shown] that the optimal spatial frequency is at about 5-7 cycles
  369    -1 per degree. Below that, features get slightly harder to detect. (Perhaps that
  370    -1 is the reason for the "you don't see the forest among the trees" phenomenon.)
  371    -1 
  372    -1 It is not obvious how to define spatial frequency in the context of the web.
  373    -1 For text, font size and weight certainly play a role. But different fonts have
  374    -1 wildly different interpretations of these values. Since fonts depend on user
  375    -1 preference, we cannot know beforehand which fonts will be used. We also don't
  376    -1 know the size of device pixels or how far the user is from the screen.
  377    -1 
  378    -1 So how do WCAG 2.x and APCA tackle this topic?
  379    -1 
  380    -1 ### WCAG 2.x
  381    -1 
  382    -1 WCAG 2.x makes the distinction between regular and [large text]. Large text is
  383    -1 defined as anything above 18 point or 14 point bold. The definition comes with
  384    -1 a lot of notes that explain the limits of that approach though, e.g. that some
  385    -1 fonts are extremely thin.
  386    -1 
  387    -1 WCAG 2.x also comes with some rules that allow users to adapt spatial frequency
  388    -1 to their needs: [1.4.4] requires that users can resize the text, [1.4.10]
  389    -1 requires that they can zoom the whole page, and [1.4.12] requires that they can
  390    -1 adjust text spacing.
  391    -1 
  392    -1 So WCAG 2.x doesn't really attempt to model spatial frequency for web content.
  393    -1 It elegantly works around the issue by handing control over to the users who
  394    -1 have all the facts.
   -1   224 Now that we have aligned the two formulas, we can compare them
   -1   225 numerically:
  395   226 
  396    -1 ### APCA
  397    -1 
  398    -1 Conversely, APCA [does attempt to model spatial frequency]:
   -1   227 ![contrast comparison]
  399   228 
  400    -1 1.  If the font has an x-height ratio of less than 0.52, increase the size by a
  401    -1     factor of `0.52 / xHeightRatio`.
  402    -1 2.  Experimentally find a weight offset so the font has a similar weight to
  403    -1     Arial or Helvetica.
  404    -1 3.  Consider additional font features and adapt the values accordingly.
  405    -1 4.  Use the lookup table provided at the link above to find a minimum contrast
  406    -1     for the given combination of size and weight.
   -1   229 These are scatter plots based on a random sample of color pairs. The
   -1   230 x-axis corresponds to background luminance, the y-axis corresponds to
   -1   231 foreground luminance (both using the APCA formula). The color of the
   -1   232 dots indicates the differences between the respective formulas.
  407   233 
  408    -1 WCAG 3 is still an early draft and does not yet contain many guidelines. I
  409    -1 assume that guidelines similar to 1.4.4, 1.4.10, and 1.4.12 will again be
  410    -1 included. So the strategy of giving users control over spatial frequency will
  411    -1 still work.
   -1   234 The plot on the bottom right compares APCA to WCAG 2.x. As we can see,
   -1   235 the biggest differences are in areas where one color is extremely light
   -1   236 or extremely dark. For light colors, APCA predicts an even higher
   -1   237 contrast (difference is in the same direction as contrast polarity). For
   -1   238 dark colors, APCA predicts a lower contrast (difference is inverse to
   -1   239 contrast polarity). The difference goes up to 20%.
  412   240 
  413    -1 With the more sophisticated link between spatial frequency and color contrast,
  414    -1 user intervention might be less relevant though. However, the model described
  415    -1 above is complicated and leaves a lot of wiggle room, especially in steps 2 and
  416    -1 3.
   -1   241 The other three plots compare APCA to a modified version of APCA where
   -1   242 one of the steps has been replaced by the corresponding step from
   -1   243 WCAG 2.x. This way we can see that `sRGBtoY` contributes 4% to the
   -1   244 difference, `YtoJ` contributes 15%, and `contrast` contributes 3%.
  417   245 
  418    -1 ## Non-text contrast
   -1   246 Since the conversion from luminance to lightness causes the biggest
   -1   247 difference, I took a closer look at it.
  419   248 
  420    -1 So far we have mainly looked at text. But other parts of a website also need to
  421    -1 be distinguishable. The concept of spatial frequency was explicitly picked
  422    -1 because it can cover those cases. What do WCAG 2.x and APCA have to say about
  423    -1 this?
   -1   249 ![lightness comparison]
  424   250 
  425    -1 ### WCAG 2.x
   -1   251 I plotted curves for both the Weber-Fechner model (log) and the Stevens
   -1   252 model (pow) with different parameters.
  426   253 
  427    -1 [1.4.11] is specifically about this issue. It basically says that all non-text
  428    -1 content that is not inactive, decorative, or controlled by the browser must
  429    -1 meet contrast requirements. Spatial frequency is not considered in this case.
  430    -1 It is also not always clear which parts of the UI are decorative and which are
  431    -1 actually relevant.
   -1   254 -   Both models produce curves with a concave shape.
   -1   255 -   The log curve with an offset of 0.05 (WCAG 2) is closer to the pow
   -1   256     curve with an exponent of 1/3 and an offset of 0.0025
   -1   257 -   The log curve with an offset of 0.4 is closer to the pow curves with
   -1   258     exponents 0.56 and 0.68 and an offset of 0.0045 (similar to APCA)
  432   259 
  433    -1 ### APCA
   -1   260 This shows that a big part of the different results between WCAG 2.x and
   -1   261 APCA is caused by a different choice of parameters. If we were to
   -1   262 change the offset value in WCAG 2.x to 0.4 we would get results much
   -1   263 closer to APCA. And if we were to change the exponents in APCA to 1/3 we
   -1   264 would get results much closer to WCAG 2.x.
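
To get a feeling for how much the parameters matter, the two lightness models can be evaluated at a single luminance value. This is a rough sketch under simplifying assumptions: `logLightness` and `powLightness` are hypothetical names, both curves are normalized to the range 0 to 1, and the small offsets on the pow side (0.0025 and 0.0045) are ignored.

```js
// Sketch: Weber-Fechner style lightness with a configurable offset,
// normalized so that y = 0 maps to 0 and y = 1 maps to 1.
function logLightness(y, offset) {
  return Math.log((y + offset) / offset) / Math.log((1 + offset) / offset);
}

// Sketch: Stevens style lightness without any offset.
function powLightness(y, exponent) {
  return Math.pow(y, exponent);
}

var y = 0.2;

// Distance of each log curve to a pow curve with exponent 1/3.
var wcagToThird = Math.abs(logLightness(y, 0.05) - powLightness(y, 1 / 3));
var modifiedToThird = Math.abs(logLightness(y, 0.4) - powLightness(y, 1 / 3));

// Distance of each log curve to a pow curve with exponent 0.68.
var wcagToApcaLike = Math.abs(logLightness(y, 0.05) - powLightness(y, 0.68));
var modifiedToApcaLike = Math.abs(logLightness(y, 0.4) - powLightness(y, 0.68));

console.log(wcagToThird, modifiedToThird, wcagToApcaLike, modifiedToApcaLike);
```

At this one sample point, the offset of 0.05 lands closer to the exponent 1/3 and the offset of 0.4 lands closer to the exponent 0.68. A single point is of course not a replacement for the plotted curves.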
  434   265 
  435    -1 As of today, APCA focusses mostly on text. Its sophisticated approach to
  436    -1 spatial frequency has a lot of potential for non-text content. I could not yet
  437    -1 find any discussion of that though.
   -1   266 Our vision adapts to the lighting conditions. In the dark, we are much
   -1   267 better at discerning dark colors. A lower exponent models these darker
   -1   268 conditions. The most complete (but also most complex) color appearance
   -1   269 model currently available is [CIECAM02]. It uses exponents between 0.31
   -1   270 and 0.72. Given that model, WCAG 2.x is on the lower (darker) end of
   -1   271 possible exponents, while APCA goes to the other (lighter) extreme. This
   -1   272 is consistent with the observation that APCA reports lower contrast for
   -1   273 darker colors.
  438   274 
  439    -1 ## Thresholds
   -1   275 ## How much contrast is enough contrast?
  440   276 
  441   277 ### WCAG 2.x
  442   278 
  443    -1 WCAG 2.x defines 3 thresholds: 3, 4.5, and 7.
   -1   279 Smaller text is generally harder to read than bigger text, so it
   -1   280 requires more contrast to be legible.[^3] Body text can also be
   -1   281 read faster with better contrast.
  444   282 
  445    -1 -   non-text content must have a contrast of at least 3
  446    -1 -   large text must have a contrast of at least 3 (AA) or 4.5 (AAA)
  447    -1 -   other text must have a contrast of at least 4.5 (AA) or 7 (AAA)
  448    -1 -   logos and inactive or decorative elements are exempted
   -1   283 WCAG 2.x requires a contrast of at least 4.5:1 for regular text and 3:1
   -1   284 for large text. If you aim for AAA conformance, the values are 7:1 and
   -1   285 4.5:1.
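
These thresholds can be written down as a small lookup. The following is a sketch only: `wcagLevel` and its return values are hypothetical names, and it assumes that a ratio exactly at the threshold passes ("at least").

```js
// Sketch: map a WCAG 2.x contrast ratio to a conformance level
// for text, using the thresholds from the paragraph above.
function wcagLevel(ratio, isLargeText) {
  var aa = isLargeText ? 3 : 4.5;
  var aaa = isLargeText ? 4.5 : 7;

  if (ratio >= aaa) {
    return 'AAA';
  } else if (ratio >= aa) {
    return 'AA';
  } else {
    return 'fail';
  }
}

console.log(wcagLevel(4.6, false)); // regular text
console.log(wcagLevel(4.6, true)); // large text
```

The same ratio can land on different levels depending on whether the text counts as large, which is why the definition of large text matters.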
  449   286 
  450   287 How these values were derived is not completely clear:
  451   288 
  452   289 > There was some user testing associated with the validation of the 2.0
  453    -1 > formula. I could not quickly find a cite for that. My recollection is that
  454    -1 > the hard data pointed to a ratio of 4.65:1 as a defensible break point. The
  455    -1 > working group was close to rounding that up to 5:1, just to have round
  456    -1 > numbers. I successfully lobbied for 4.5:1 mostly because (1) the empirical
  457    -1 > data was not overwhelmingly compelling, and (2) 4.5:1 allowed the option for
  458    -1 > white and black (simultaneously) on a middle gray.\
   -1   290 > formula. I could not quickly find a cite for that. My recollection is
   -1   291 > that the hard data pointed to a ratio of 4.65:1 as a defensible break
   -1   292 > point. The working group was close to rounding that up to 5:1, just to
   -1   293 > have round numbers. I successfully lobbied for 4.5:1 mostly because
   -1   294 > (1) the empirical data was not overwhelmingly compelling, and (2)
   -1   295 > 4.5:1 allowed the option for white and black (simultaneously) on a
   -1   296 > middle gray.\
  459   297 > -- <https://github.com/w3c/wcag/issues/695#issuecomment-484187617>
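
The second argument in the quote can be checked numerically. For any color, the contrast against white multiplied by the contrast against black is always `1.05 / 0.05 = 21`, so both can only pass the same threshold if it is at most `sqrt(21)`, which is about 4.58. The sketch below assumes the standard sRGB conversion and uses `#767676` as an example of a middle gray; `channelToLinear` and `wcagContrast` are hypothetical names.

```js
// Sketch: standard sRGB gamma decoding for one 8-bit channel.
function channelToLinear(c) {
  c = c / 255;
  if (c <= 0.04045) {
    return c / 12.92;
  }
  return Math.pow((c + 0.055) / 1.055, 2.4);
}

function sRGBtoY(srgb) {
  return (
    0.2126 * channelToLinear(srgb[0]) +
    0.7152 * channelToLinear(srgb[1]) +
    0.0722 * channelToLinear(srgb[2])
  );
}

// WCAG 2.x contrast ratio with the 0.05 offset, polarity removed.
function wcagContrast(fg, bg) {
  var yfg = sRGBtoY(fg) + 0.05;
  var ybg = sRGBtoY(bg) + 0.05;
  return Math.max(yfg, ybg) / Math.min(yfg, ybg);
}

var gray = [0x76, 0x76, 0x76];
var whiteOnGray = wcagContrast([255, 255, 255], gray);
var blackOnGray = wcagContrast([0, 0, 0], gray);

console.log(whiteOnGray, blackOnGray, whiteOnGray * blackOnGray);
```

With a threshold of 5:1 no gray could satisfy both combinations, because that would require a product of at least 25.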
  460   298 
  461    -1 ### APCA
   -1   299 [Large text] is defined as anything of at least 18 point or 14 point bold. The
   -1   300 definition comes with a lot of notes that explain the limits of that
   -1   301 approach though, e.g. that some fonts are extremely thin and that font
   -1   302 size depends on user settings.
  462   303 
  463    -1 APCA defines 6 thresholds: 15, 30, 45, 60, 75, 90.
   -1   304 WCAG 2.x also comes with some rules that allow users to adapt font size
   -1   305 to their needs: [1.4.4] requires that users can resize the text,
   -1   306 [1.4.10] requires that they can zoom the whole page, and [1.4.12]
   -1   307 requires that they can adjust text spacing.
  464   308 
  465    -1 The required threshold depends on the spatial frequency (see above). 45, 60,
  466    -1 and 75 loosely correspond to 3, 4.5, and 7 in WCAG 2.x.
   -1   309 So in a way, WCAG 2.x sidesteps the issue by handing control over to
   -1   310 the users who have all the facts.
  467   311 
  468    -1 ### Comparison
   -1   312 ### APCA
  469   313 
  470    -1 Again I generated random color pairs and used them to compare APCA to WCAG 2.x:
  471    -1 
  472    -1 |         |    < 15 |   15-30 |   30-45 |   45-60 |   60-75 |   75-90 |    > 90 |   total |
  473    -1 | -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:|
  474    -1 |     < 3 |  34.8\* |  25.1\* |  11.7\* |     1.5 |     0.0 |     0.0 |     0.0 |    73.0 |
  475    -1 |   3-4.5 |     0.0 |     0.7 |     6.3 |   6.7\* |     0.9 |     0.0 |     0.0 |    14.5 |
  476    -1 |   4.5-7 |     0.0 |     0.0 |     0.7 |     3.9 |   3.9\* |     0.2 |     0.0 |     8.7 |
  477    -1 |     > 7 |     0.0 |     0.0 |     0.0 |     0.3 |     1.7 |   1.6\* |   0.2\* |     3.8 |
  478    -1 |   total |    34.8 |    25.8 |    18.6 |    12.3 |     6.5 |     1.8 |     0.2 |  83.9\* |
  479    -1 
  480    -1 The columns correspond to APCA thresholds, the rows correspond to WCAG 2.x
  481    -1 thresholds. For example, 6.3 % of the generated color pairs pass WCAG 2.x with
  482    -1 a contrast above 3, but fail APCA with a contrast below 45.
  483    -1 
  484    -1 The \* indicate cases where both algorithms agree on a threshold level. The
  485    -1 cell in the bottom right is the total number of cases where both algorithms
  486    -1 agree, so it can be seen as an indicator of how similar the algorithms are.
  487    -1 
  488    -1 |         |    < 15 |   15-30 |   30-45 |   45-60 |   60-75 |   75-90 |    > 90 |   total |
  489    -1 | -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:|
  490    -1 |   < 3.5 |  34.8\* |  25.7\* |  15.1\* |     4.1 |     0.1 |     0.0 |     0.0 |    79.6 |
  491    -1 | 3.5-5.5 |     0.0 |     0.1 |     3.5 |   6.5\* |     2.4 |     0.0 |     0.0 |    12.6 |
  492    -1 |   5.5-8 |     0.0 |     0.0 |     0.1 |     1.7 |   3.3\* |     0.4 |     0.0 |     5.4 |
  493    -1 |     > 8 |     0.0 |     0.0 |     0.0 |     0.1 |     0.8 |   1.4\* |   0.2\* |     2.4 |
  494    -1 |   total |    34.8 |    25.8 |    18.6 |    12.3 |     6.5 |     1.8 |     0.2 |  86.9\* |
  495    -1 
  496    -1 The second table again compares APCA to WCAG 2.x, but this time I tweaked the
  497    -1 thresholds to minimize the difference. This shows that some of the difference
  498    -1 is caused by the choice of thresholds, not the formula itself.
  499    -1 
  500    -1 |         |    < 15 |   15-30 |   30-45 |   45-60 |   60-75 |   75-90 |    > 90 |   total |
  501    -1 | -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:|
  502    -1 |   < 1.6 |  33.3\* |     0.7 |     0.0 |     0.0 |     0.0 |     0.0 |     0.0 |    34.0 |
  503    -1 | 1.6-2.5 |     1.4 |  23.5\* |     0.7 |     0.0 |     0.0 |     0.0 |     0.0 |    25.6 |
  504    -1 | 2.5-3.9 |     0.0 |     1.6 |  16.8\* |     0.5 |     0.0 |     0.0 |     0.0 |    18.9 |
  505    -1 |   3.9-6 |     0.0 |     0.0 |     1.1 |  11.2\* |     0.3 |     0.0 |     0.0 |    12.6 |
  506    -1 |     6-9 |     0.0 |     0.0 |     0.0 |     0.6 |   5.9\* |     0.1 |     0.0 |     6.6 |
  507    -1 |    9-13 |     0.0 |     0.0 |     0.0 |     0.0 |     0.3 |   1.7\* |     0.0 |     2.0 |
  508    -1 |    > 13 |     0.0 |     0.0 |     0.0 |     0.0 |     0.0 |     0.1 |   0.2\* |     0.2 |
  509    -1 |   total |    34.8 |    25.8 |    18.6 |    12.3 |     6.5 |     1.8 |     0.2 |  92.5\* |
  510    -1 
  511    -1 The third table compares APCA to a modified WCAG 2.x contrast with a flare
  512    -1 value of 0.4. As expected, the difference is reduced significantly, though
  513    -1 there is still a considerable difference left.
  514    -1 
  515    -1 |         |    < 15 |   15-30 |   30-45 |   45-60 |   60-75 |   75-90 |    > 90 |   total |
  516    -1 | -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:| -------:|
  517    -1 |    < 15 |  33.6\* |     1.3 |     0.0 |     0.0 |     0.0 |     0.0 |     0.0 |    34.9 |
  518    -1 |   15-30 |     1.2 |  23.2\* |     1.2 |     0.0 |     0.0 |     0.0 |     0.0 |    25.5 |
  519    -1 |   30-45 |     0.0 |     1.3 |  16.3\* |     1.1 |     0.0 |     0.0 |     0.0 |    18.8 |
  520    -1 |   45-60 |     0.0 |     0.0 |     1.2 |  10.4\* |     0.9 |     0.0 |     0.0 |    12.5 |
  521    -1 |   60-75 |     0.0 |     0.0 |     0.0 |     0.8 |   5.4\* |     0.2 |     0.0 |     6.4 |
  522    -1 |   75-90 |     0.0 |     0.0 |     0.0 |     0.0 |     0.3 |   1.6\* |     0.1 |     1.9 |
  523    -1 |    > 90 |     0.0 |     0.0 |     0.0 |     0.0 |     0.0 |     0.0 |   0.1\* |     0.1 |
  524    -1 |   total |    34.8 |    25.8 |    18.6 |    12.3 |     6.5 |     1.8 |     0.2 |  90.4\* |
  525    -1 
  526    -1 The last table compares APCA to itself, but with foreground and background
  527    -1 switched. WCAG 2.x does not distinguish between foreground and
  528    -1 background, so this comparison would be pointless there. APCA, on the other hand,
  529    -1 uses different exponents for foreground and background. This table shows that
  530    -1 this does have a small but still significant impact on the results.
   -1   314 APCA on the other hand has a much more sophisticated threshold system. It
   -1   315 provides a [table] that defines thresholds based on font size, weight
   -1   316 and whether or not the text is body text. It also tries to account for
   -1   317 different fonts. However, that system is complicated and leaves a lot of
   -1   318 wiggle room.
   -1   319 
   -1   320 Unfortunately, there is next to no information on how that table was
   -1   321 generated, so there is not much I can say about it.
  531   322 
  532   323 ## Conclusion
  533   324 
  534    -1 In this analysis I took a deeper look at the Accessible Perceptual Contrast
  535    -1 Algorithm (APCA), a new algorithm to predict visual contrast. I compared it to
  536    -1 an existing algorithm that has been part of WCAG 2.x, the current standard for
  537    -1 accessibility testing for the web.
   -1   325 > all models are wrong, but some are useful\
   -1   326 > -- George Box
  538   327 
  539    -1 Though still in early development, APCA is very different from the older
  540    -1 algorithm in many key aspects:
   -1   328 WCAG faces a difficult challenge: There is no one-size-fits-all solution
   -1   329 for accessibility. Different humans have different needs, and different
   -1   330 situations require different kinds of support.
  541   331 
  542    -1 -   It uses a different luminance calculation that deviates from the standards
  543    -1     but is supposed to be closer to real world usage.
  544    -1 -   It uses the more accurate Stevens model and assumes different lighting
  545    -1     conditions for converting luminance to perceptual lightness.
  546    -1 -   It adds an additional step where different exponents are applied to
   -1   332 This is also the case in the context of color contrast: vision
   -1   333 impairments, ambient light, screen settings, font settings, and zoom can
   -1   334 all have a pronounced impact on legibility. None of these are known
   -1   335 beforehand by website authors, so the rules provided by WCAG need to
   -1   336 work regardless of these factors.
   -1   337 
   -1   338 Faced with the question of whether it wanted to give simple and precise
   -1   339 instructions (that might not be ideal in every situation) or give
   -1   340 nuanced but ultimately vague advice, WCAG went with the former. So today
   -1   341 WCAG provides a list of detailed steps for evaluating a website. Many of
   -1   342 these checks can be automated. It does not always result in perfect
   -1   343 accessibility, but it gives lawmakers a solid baseline.
   -1   344 
   -1   345 APCA proposes some changes to the way we calculate color contrasts:
   -1   346 
   -1   347 1.  It uses a different luminance calculation that deviates from the
   -1   348     standards but is supposed to be closer to real world usage.
   -1   349 2.  It uses the more modern Stevens model.
   -1   350 3.  It assumes different (brighter) lighting conditions for converting
   -1   351     luminance to perceptual lightness.
   -1   352 4.  It adds an additional step where different exponents are applied to
  547   353     foreground and background.
  548    -1 -   It uses different scaling. Crucially, this scaling is based on a difference
  549    -1     rather than a ratio.
  550    -1 -   It uses a more sophisticated link between spatial frequency and minimum
  551    -1     color contrast that might allow for more nuanced thresholds.
   -1   354 5.  It uses different scaling. Crucially, this scaling is based on a
   -1   355     difference rather than a ratio.
   -1   356 6.  It uses a more sophisticated system for thresholds.
   -1   357 
   -1   358 I am fine with (2) and (5). I am not sure if they justify breaking
   -1   359 changes, but if we want to come up with a new algorithm then I think
   -1   360 these are ideas we should keep.
   -1   361 
   -1   362 Points (1) and (4) need empirical evaluation. Unfortunately, there is
   -1   363 still a [severe lack of publicly available research] on APCA. On
   -1   364 request, APCA's author just provided a [list of unrelated links].
  552   365 
  553    -1 So far I like many of the ideas of APCA, but I am not convinced that they are a
  554    -1 significant enough improvement to justify breaking backwards compatibility. I
  555    -1 am also concerned by the [lack of publicly available evidence].
   -1   366 For me, (3) and (6) are the big issues. Even if APCA is a good model to
   -1   367 calculate contrast, many of its innovations just don't work in the
   -1   368 context of WCAG where we need to provide clear guidance without knowing
   -1   369 the viewing conditions.
  556   370 
  557    -1 Much of the difference between APCA and WCAG 2 comes down to a different choice
  558    -1 of parameters, and that is ultimately a policy decision. I hope this analysis
  559    -1 can support the community in figuring out what questions need to be answered.
   -1   371 Granted, the WCAG 2.x algorithm also makes assumptions. But instead of
   -1   372 fighting over which parameters are *correct*, we should discuss how we
   -1   373 can deal with that uncertainty.
  560   374 
   -1   375 [issues]: https://www.bounteous.com/insights/2019/03/22/orange-you-accessible-mini-case-study-color-ratio/
  561   376 [Web Content Accessibility Guidelines]: https://www.w3.org/TR/WCAG21/
  562   377 [sRGB color space]: https://en.wikipedia.org/wiki/SRGB
  563    -1 ["gold standard" for text contrast]: https://github.com/w3c/wcag/issues/695#issuecomment-483805436
   -1   378 [Weber-Fechner model]: https://en.wikipedia.org/wiki/Weber%27s_Law
  564   379 [Regarding APCA Exponents]: https://git.apcacontrast.com/documentation/regardingexponents
  565    -1 [Studies have shown]: https://en.wikipedia.org/wiki/Contrast_(vision)#Contrast_sensitivity_and_visual_acuity
   -1   380 [github issue]: https://github.com/w3c/wcag3/issues/226
   -1   381 [Stevens model]: https://en.wikipedia.org/wiki/Stevens's_power_law
   -1   382 [contrast comparison]: plots/contrast_comparison.png
   -1   383 [lightness comparison]: plots/lightness_comparison.png
   -1   384 [CIECAM02]: https://en.wikipedia.org/wiki/CIECAM02
  566   385 [large text]: https://www.w3.org/TR/WCAG21/#dfn-large-scale
  567   386 [1.4.4]: https://www.w3.org/TR/WCAG21/#resize-text
  568   387 [1.4.10]: https://www.w3.org/TR/WCAG21/#reflow
  569   388 [1.4.12]: https://www.w3.org/TR/WCAG21/#text-spacing
  570    -1 [does attempt to model spatial frequency]: https://git.apcacontrast.com/WEBTOOLS/APCA/
  571    -1 [1.4.11]: https://www.w3.org/TR/WCAG21/#non-text-contrast
  572    -1 [lack of publicly available evidence]: https://github.com/w3c/silver/issues/574
   -1   389 [table]: https://git.apcacontrast.com/documentation/README#font-use-lookup-tables
   -1   390 [severe lack of publicly available research]: https://github.com/w3c/silver/issues/574
   -1   391 [list of unrelated links]: https://github.com/w3c/wcag3/issues/29
  573   392 
  574    -1 [^1]: [Monaci, Gianluca & Menegaz, Gloria & Susstrunk, S. & Knoblauch, Kenneth. (2022). Color Contrast Detection in Spatial Chromatic Noise.](https://www.researchgate.net/publication/37435854_Color_Contrast_Detection_in_Spatial_Chromatic_Noise)
  575    -1 [^2]: [Hwang AD, Peli E. (2016). New Contrast Metric for Realistic Display Performance Measure.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5489230/)
   -1   393 [^1]: [Hwang AD, Peli E. (2016). New Contrast Metric for Realistic Display Performance Measure.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5489230/)
   -1   394 [^2]: [Monaci, Gianluca & Menegaz, Gloria & Susstrunk, S. & Knoblauch, Kenneth. (2022). Color Contrast Detection in Spatial Chromatic Noise.](https://www.researchgate.net/publication/37435854_Color_Contrast_Detection_in_Spatial_Chromatic_Noise)
  576   395 [^3]: [Georgeson, M. A., & Sullivan, G. D. (1975). Contrast constancy: deblurring in human vision by spatial frequency channels.](https://pubmed.ncbi.nlm.nih.gov/1206570/)
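
The threshold tables removed in this commit cross-tabulate two contrast metrics over generated color pairs. A minimal sketch of that kind of comparison is shown below. It only implements the WCAG 2.x contrast ratio with an adjustable flare term (0.05 is the WCAG 2.x value, 0.4 the modified value mentioned in the removed text). The helper names (`relative_luminance`, `wcag_contrast`, `bucket`, `cross_tabulate`, `random_pairs`) and the thresholds used for the flare-0.4 variant are assumptions made for illustration. They are not taken from the repository, and APCA itself is left out; any function with the same `(fg, bg)` signature could be plugged in instead.

```python
import random
from bisect import bisect_right


def relative_luminance(rgb):
    """Relative luminance of an 8-bit sRGB color, following WCAG 2.x."""
    def linearize(channel):
        c = channel / 255
        # For 8-bit input, the cutoffs 0.03928 and 0.04045 give identical results.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def wcag_contrast(fg, bg, flare=0.05):
    """WCAG 2.x style contrast ratio; `flare` is the additive term (0.05 in WCAG 2.x)."""
    l_fg, l_bg = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l_fg, l_bg), min(l_fg, l_bg)
    return (lighter + flare) / (darker + flare)


def bucket(value, thresholds):
    """Index of the threshold band; a value equal to a threshold counts as passing it."""
    return bisect_right(thresholds, value)


def cross_tabulate(pairs, metric_a, thresholds_a, metric_b, thresholds_b):
    """Percentage of pairs per (band of metric_a, band of metric_b) cell."""
    rows, cols = len(thresholds_a) + 1, len(thresholds_b) + 1
    counts = [[0] * cols for _ in range(rows)]
    for fg, bg in pairs:
        i = bucket(metric_a(fg, bg), thresholds_a)
        j = bucket(metric_b(fg, bg), thresholds_b)
        counts[i][j] += 1
    return [[100 * c / len(pairs) for c in row] for row in counts]


def random_pairs(n, seed=0):
    """Uniformly random 8-bit sRGB color pairs (a hypothetical sampling strategy)."""
    rng = random.Random(seed)
    def color():
        return tuple(rng.randrange(256) for _ in range(3))
    return [(color(), color()) for _ in range(n)]


if __name__ == "__main__":
    # Rows: WCAG 2.x bands; columns: flare 0.4 with purely illustrative thresholds.
    table = cross_tabulate(
        random_pairs(10_000),
        wcag_contrast, [3, 4.5, 7],
        lambda fg, bg: wcag_contrast(fg, bg, flare=0.4), [1.5, 2, 2.5],
    )
    for row in table:
        print(" | ".join(f"{cell:5.1f}" for cell in row))
```

Because the metrics are passed in as plain functions, the same `cross_tabulate` call can compare any two contrast formulas or the same formula under two sets of thresholds, which is how the removed text separates the effect of the thresholds from the effect of the formula.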