What happens when a red blood cell dies?

Mature human red blood cells (erythrocytes) are highly specialized and terminally differentiated cells that lack normal cell organelles such as a nucleus, endoplasmic reticulum and mitochondria. Healthy erythrocytes have a lifespan of about 120 days, after which they are cleared from the circulation. The senescence involved in erythrocyte death and removal is characterized by distinct morphological changes that include cell shrinkage, plasma membrane microvesiculation, progressive shape change, loss of peripheral membrane proteins, and the loss of membrane lipid asymmetry including the externalization of phosphatidylserine, which triggers the removal of erythrocytes by macrophages. Interestingly, these morphological changes are very similar to the morphological hallmarks of programmed cell death. However, as mature erythrocytes lack cellular organelles and because they survive conditions that induce programmed cell death, erythrocytes have been considered to be the only mammalian cells lacking programmed cell death machinery.

    Keywords: apoptosis, erythrocyte, programmed cell death, death effector, caspase, cell biology, genetics, molecular medicine




    A Journal Club article by Richard W.C. Wong and Siu Yuen Chan entitled ‘Epidermal growth factor receptor: a transcription factor?’, published in the November 2001 issues of TiBS (Trends Biochem. Sci. 26, 645–646), TiG (Trends Genet. 17, 625–626) and TEM (Trends Endocrinol. Metab. 12, 431), contains large tracts of text that have been copied from a previously published Nature Cell Biology News and Views article by Mark Waugh and Justin Hsuan (Nat. Cell Biol. (2001) 3, E209–E211). The authors, TiBS, TiG, TEM and Elsevier Science regret that this has happened and apologize unreservedly to Justin Hsuan and Mark Waugh, to the readers, editorial team and publishers of Nature Cell Biology, and to our own readers.




    Recent experiments demonstrate the molecular process by which an embryo begins to mark cells for a neural versus an epidermal fate. In the 1950s, the mathematician Alan Turing proposed the ‘morphogen model’ of development, whereby a diffusible chemical signal (a ‘morphogen’) triggers certain differentiation processes. The morphogen is produced by one set of cells and removed by an adjacent set, leading to varying concentrations depending on distance from the source. Different morphogen concentrations could determine different cell fates. One potential morphogen is called Short Gastrulation (Sog) in fruit flies. However, until now, no one could demonstrate that Sog was present as a concentration gradient in vivo. A new sensitive fluorescence detection technique made it possible to observe the accumulation of Sog directly and showed that it is indeed graded. The researchers, led by Ethan Bier, also showed that an enzyme called Tolloid, which is produced in the epidermal region and which cleaves and inactivates Sog, might be responsible for creating this gradient. ‘We observed that in the absence of Tolloid, Sog levels are greatly elevated in epidermal cells and accumulate steadily until the entire embryo contains nearly uniform amounts of Sog,’ said Bier. ‘These studies provide the final link demonstrating that Sog is a morphogen and explain how the graded distribution of Sog is created.’ (Srinivasan, S. [2002] Dev. Cell 2, 91–101) PL


    In a discovery that appears to turn textbook knowledge on its head, researchers at Massachusetts General Hospital and Harvard Medical School found that damaged or old red blood cells — and the iron they carry — are in fact mainly taken care of by the liver and not, as previously believed, by the spleen.

    The findings, made public in a report titled “On-demand erythrocyte disposal and iron recycling requires transient macrophages in the liver,” and published in the journal Nature Medicine, have a huge impact on research in sickle cell disease and other conditions affecting red blood cells.

    “In addition to identifying the liver as the primary site of these processes, we also identified a transient population of bone-marrow-derived immune cells as the recycling cells,” senior author Filip Swirski, an associate professor of radiology at Harvard, said in a news release.

    When red blood cells die, either by damage or old age (which, in red blood cells, is about 120 days), the iron inside the cells needs to be taken care of and recycled. If iron, bound to hemoglobin, is released uncontrollably — as can be the case when large numbers of cells get damaged — toxicity, kidney damage, and anemia can follow.

    To reach their new conclusions about the final resting place of red blood cells, the research team used several different models, including a mouse model of sickle cell disease. They found that as the number of damaged blood cells started rising in the blood, a particular type of immune cell called the monocyte jumped into action.

    Researchers observed monocytes engulfing damaged red blood cells and taking them to both the spleen and the liver. When checking back a couple of hours later, the team was surprised to find nearly all of the damaged cells inside another type of immune cell known to engulf old cells and other cellular debris — macrophages — inside the liver.

    Monocytes are, in fact, the forebears of macrophages, and the team noted that the liver started producing molecules known as chemokines — factors triggering the movement of cells toward their source — attracting the red blood cell-containing monocytes. Once the cells entered the liver, they turned into macrophages capable of recycling the iron.

    The accumulation of such macrophages in the liver has been noted in sickle cell disease, an observation now explained by the new findings. To double-check their results, the researchers blocked the process and watched the levels of free iron and hemoglobin rise in the bloodstream, leading to liver and kidney damage in the experimental animals.

    “The mechanism we identified could be either helpful or damaging, depending on the conditions. If overactive, it could remove too many red blood cells, but if it’s sluggish or otherwise impaired, it could lead to iron toxicity,” said Swirski, adding that research into the details of the mechanism might teach us how to harness it when needed.

    Red blood cells are the key to life. They are constantly traveling through your body, delivering oxygen and removing waste. If they didn’t do their job, you would slowly die.

    Red blood cells contain a protein called hemoglobin that gives blood its red hue. Hemoglobin contains iron, which makes it an excellent vehicle for transporting oxygen and carbon dioxide.

    As blood passes through the lungs, oxygen molecules attach to the hemoglobin. When the blood passes through the body’s tissue, the hemoglobin releases oxygen to the cells. The empty hemoglobin molecules then bond with the tissue’s carbon dioxide or other waste gases to transport them away.

    Over time, red blood cells get worn out and eventually die. The average lifespan of a red blood cell is only 120 days. But don’t worry! Your bone marrow is continually producing new blood cells.

    The population of red blood cells (RBCs) in the organism must remain within definite limits in order to ensure the oxygenation of body tissues and to maintain adequate values of blood pressure and viscosity. This is achieved by means of homeostatic mechanisms that control the ratio between cell production and destruction and compensate for any imbalance between oxygen supply and demand by increasing or reducing the number of circulating RBCs [1,2].

    The formation of new RBCs is controlled by erythropoietin (Epo), a hormone produced by fibroblasts of peritubular capillaries in the kidney that induces proliferation and differentiation of erythroid precursor cells in the bone marrow [3]. On the other hand, RBCs are removed by macrophages of the mononuclear phagocytic system (MPS) when passing through the splenic and hepatic sinusoids. Macrophages identify and phagocytize RBCs that have attained a critical age (120 days in humans and 60 days in mice) in a process known as erythrophagocytosis [4–6].

    In hypoxia, fibroblasts increase the release of Epo, thus accelerating the production of new cells and boosting the population of RBCs [7,8]. Conversely, if oxygen levels rise above physiological needs (e.g. in acclimation to a higher partial pressure of oxygen after descent to sea level from high altitude), fibroblasts lower the production of Epo and the population of RBCs shrinks to a new equilibrium size [3,9,10]. An excess of oxygen supply also increases the rate of cell destruction through neocytolysis, a homeostatic mechanism that selectively removes RBCs of only 10 or 11 days of age and contributes to the rapid reduction of the number of cells [11–14].

    The switch from a 120-day to an 11-day lifespan in response to environmental factors indicates that lifespan is not a fixed, intrinsic feature of RBCs. This point is further evidenced by the fact that RBCs live around 40 days less in newborn humans than in adults [15]. Even though the mechanisms that regulate changes in RBC lifespan remain obscure, it is widely assumed that RBC ageing and death are ultimately caused by oxidative stress (OS) [16–18]. The continuous exposure to highly reactive oxygen radicals deteriorates the membrane and cytoplasm of the RBC, which may eventually compromise its function [19]. In fact, higher sensitivity to OS correlates with shorter lifespan [20]. This observation has been interpreted as evidence of an active mechanism that would set RBC lifespan by fine-tuning the expression of genes that confer resistance to OS in erythroid precursors [16,20]. On this view, human RBCs would be genetically configured to show signs of OS-driven senescence around the age of 120 days. Macrophages of the MPS would then identify aged RBCs by means of these signs [21].

    In our opinion, the explanation of RBC lifespan as determined exclusively by OS is incomplete. For one thing, not all aged RBCs show the typical signs of severe OS-derived damage, such as cell shrinkage and membrane blebbing [22,23]. As a matter of fact, defective RBCs of any age are destroyed not by erythrophagocytosis but through an alternative mechanism known as eryptosis [24,25]. This suggests that normal and damaged RBCs follow different phagocytosis pathways. On the other hand, the 10-fold decrease in RBC lifespan during neocytolysis would require a substantial reduction in the resistance of RBCs to oxidative damage. This would multiply the risk of RBC malfunction, a feature that seems unlikely for a physiological homeostatic mechanism. Alternatively, neocytolysis and erythrophagocytosis might be driven by different mechanisms [3,11], implying that some aspects of RBC lifespan cannot be explained by OS alone.

    We postulate in this work that OS should not be considered the key determinant of RBC lifespan, even if it causes the destruction of a fraction of circulating cells and certainly imposes an upper bound on the potential duration of RBCs in the blood. We suggest that lifespan is instead set by a molecular algorithm that controls cell-to-cell interactions between RBCs and macrophages of the MPS. We will show that such an algorithm could fine-tune RBC lifespan in a variety of ways, thus providing a flexible system to adapt the number of cells to the demand for oxygen in the tissues.

    The view of RBC lifespan introduced here provides a theoretical framework in which to integrate different observations regarding RBC biology, such as erythrophagocytosis, neocytolysis and the seemingly paradoxical presence of auto-antibodies against host RBCs in the organism. In particular, we will show that these phenomena emerge as alternative outcomes of the same mechanisms working under different conditions of oxygen availability.

    The phagocytosis of RBCs by macrophages of the MPS is known to be mediated by phosphatidylserine (PS) and CD47 [26–31]. PS and CD47 have been labelled as ‘eat-me’ and ‘don’t-eat-me’ signals, owing to their pro- and anti-phagocytic effects, respectively [32,33]. Available empirical evidence concerning the dynamics of PS and CD47 expression in the membrane of RBCs can be summarized as follows:

    • (E1) PS is confined to the inner leaflet of the cell membrane in newly formed RBCs, so it is invisible to macrophages. This membrane asymmetry is progressively lost in ageing RBCs, which increases PS exposure on the cell surface [34–36]. Therefore, the pro-phagocytic effect of PS intensifies with the age of the RBC (figure 1a).

    • (E2) Conversely, the anti-phagocytic activity of CD47 is higher at the birth of the RBC [37,38]. Progressively lower expression of the protein or conformational changes in its spatial structure diminish its activity as a phagocytosis inhibitor as the cell ages [39] (figure 1a).

    • (E3) The effects of PS and CD47 cancel each other out [40], so that the net balance between PS and CD47 in an RBC determines whether or not it is destroyed by macrophages [22]. From points E1 and E2, it follows that ‘don’t-eat-me’ signals offset ‘eat-me’ signals in the membrane of young RBCs, preventing their phagocytosis. The difference between ‘eat-me’ and ‘don’t-eat-me’ signals grows in ageing RBCs until it reaches a critical threshold that elicits their destruction by macrophages of the MPS [40,41] (figure 1b).

    • (E4) It has also been observed that RBCs with sufficiently low levels of CD47 are phagocytized regardless of the amount of PS present on their surface [42,43]. In this case, young RBCs are not destroyed because of the anti-phagocytic effect of CD47 (evidence E2). Owing to the progressive loss of CD47 activity in ageing RBCs, ‘don’t-eat-me’ signals eventually fall below a certain level that prompts the phagocytosis of the cell (figure 1c).


    Figure 1. Rationale of the conceptual model of RBC lifespan determination. (a) Time evolution of membrane signals in an RBC according to empirical evidence (see points E1 and E2). (b) An RBC is phagocytized when the difference between ‘eat-me’ and ‘don’t-eat-me’ signals in its membrane attains a critical threshold (evidence E3). (c) An RBC can also be phagocytized if its level of ‘don’t-eat-me’ signals falls below a critical threshold (E4). (d) The conditions triggering RBC phagocytosis are mutually exclusive. In this example, the phagocytosis of the RBC occurs because the expression of ‘don’t-eat-me’ signals falls below a critical threshold (condition E4). (e,f) Different dynamics of membrane signals result in different lifespans (e) or in the RBC being phagocytized because it fulfils condition E3 before condition E4 (f).


    The conditions that trigger RBC phagocytosis (points E3 and E4) seem to be simultaneously fulfilled by ageing RBCs. However, since any particular RBC can only be phagocytized once, both conditions are in fact mutually exclusive. Only the first of the thresholds to be reached determines the lifespan of the RBC (figure 1d–f). On the other hand, both conditions seem to accomplish the same purpose, since both foster the phagocytosis of aged RBCs and the survival of young cells. This raises the question of why two apparently redundant pathways of RBC removal exist.

    In order to address this issue, we begin by remarking that CD47 and PS also play a major role in the control of the phagocytosis of other cell types by macrophages [44–46]. Specifically, CD47 is broadly expressed in the host and absent in foreign cells [29,47], while PS is confined to the membrane of apoptotic host cells [48,40]. These patterns of PS and CD47 expression allow macrophages to identify CD47+ cells as self-structures [31,42]. In this case, accompanying high levels of PS are recognized as a mark of apoptosis, which triggers the phagocytosis of the cell and the release of anti-inflammatory signals that avoid autoimmunity against healthy tissues [49,50]. On the other hand, the absence of CD47 in a cell membrane reveals the presence of a potential infection [51,52]. Unlike the silent clearance of apoptotic host cells, phagocytosis of CD47− cells is followed by the activation of the macrophage [53], and the secretion of pro-inflammatory signals that may lead to an innate immune response [54,55].

    We postulate that the role of PS and CD47 in the phagocytosis of RBCs (as described in points E3 and E4) follows this general pattern. Young RBCs, like non-apoptotic host cells, show high levels of CD47 and low levels of PS, which prevents their phagocytosis by macrophages. Among aged RBCs, those with high PS and low CD47 expression are comparable to apoptotic host cells, while those expressing very low levels of CD47 can be likened to foreign cells. Bearing these analogies in mind, we hypothesize the existence of two alternative pathways of RBC phagocytosis that entail different macrophage reactions. Specifically, we suggest that the pathway controlled by the balance between PS and CD47 (E3) is similar to the removal of apoptotic host cells. In particular, it does not trigger any immune response. By contrast, the phagocytosis of RBCs with very low CD47 expression (E4) might be analogous to the destruction of non-self agents by macrophages, and could provoke autoimmune reactions against host RBCs. In the remainder of this article, we will refer to these phagocytosis pathways as silent and immune, respectively.

    The existence of an ad hoc mechanism to provoke autoimmunity may seem paradoxical. However, it has long been observed that auto-antibodies targeting host RBCs are usually present in the organism [56–58]. Anti-RBC antibodies are natural antibodies produced by B-1 cells [59,58]. Unlike antibodies from other B cell subsets, B-1 antibodies have anti-inflammatory effects, which minimize potential collateral damage to host tissues [60,61]. This might explain why anti-RBC auto-antibodies are usually innocuous [62] and only occasionally cause clinical disorders, known under the general term of autoimmune haemolytic anaemia [62]. On the other hand, natural antibodies are spontaneously produced in the absence of foreign antigens [56] so their specificity for RBCs cannot be explained as due to cross-reactivity of RBC epitopes and non-self structures encountered in previous infections. This raises the question of how these auto-antibodies are produced.

    Our assumption of an immune pathway of RBC phagocytosis suggests a possible answer to this question. Macrophages of the MPS express MHC molecules, and can therefore act as antigen-presenting cells [63,64]. We suggest that, after phagocytizing RBCs with very low CD47 expression, these macrophages would initiate an adaptive immune response, much as they do after phagocytizing foreign cells. As a matter of fact, it has recently been observed that removal of CD47 from self-RBCs indeed suffices to trigger immune responses in mice [65]. Nevertheless, since RBCs are not pathogens, macrophages of the MPS would recruit B-1 cells instead of more aggressive B cell types, leading to the production of non-inflammatory antibodies against host RBCs. The functional role of these anti-RBC auto-antibodies remains to be explained. In this respect, we will show in the following sections that anti-RBC autoimmunity, together with erythrophagocytosis and neocytolysis, fits into a global, coherent model of RBC homeostasis. In order to do that, we will next state the previous conceptual model in mathematical terms.

    As we have discussed above, quantitative dynamics of PS and CD47 seem to determine RBC lifespan (figure 1d–f). However, to the best of our knowledge, no such quantitative analysis is currently available in the literature. In the absence of empirical data, we will formulate a mathematical model that reproduces the qualitative features of PS and CD47 dynamics outlined in the previous section (see points E1 to E4). This model is based on the following assumptions:

    • (A1) The expression of ‘eat-me’ signals in the outer membrane of the RBC increases at a constant rate β.

    • (A2) The number of ‘don’t-eat-me’ signals decreases at a constant rate α. This results in an exponential decay, a behaviour that has been described for other RBC membrane proteins (e.g. [66,67]).

    • (A3) Two independent thresholds exist, denoted by $T_s$ and $T_i$, that trigger the silent and immune phagocytosis pathways, respectively.

    Assumptions A1 and A2 do not intend to account for the molecular mechanisms underlying the time evolution of membrane signals. Instead, they have been chosen for the sake of simplicity in order to show the relevance of signal dynamics in RBC homeostasis. Nevertheless, new data about PS and CD47 dynamics could be easily included in this approach by modifying assumptions A1 and A2. We will discuss the implications of this particular choice of assumptions in the last section of this article.

    Denoting by E(t) and D(t) the number of ‘eat-me’ and ‘don’t-eat-me’ signals at time t, respectively, assumptions A1 and A2 can be stated in mathematical terms as follows:

    $$D'(t) = -\alpha D(t) \qquad\text{and}\qquad E'(t) = \beta, \tag{2.1}$$

    where α and β are positive parameters.

    Integrating equations (2.1) we get an explicit expression for the dynamics of ‘eat-me’ and ‘don’t-eat-me’ signals:

    $$D(t) = D_0\, e^{-\alpha t} \qquad\text{and}\qquad E(t) = E_0 + \beta t, \tag{2.2}$$

    where $E_0$ and $D_0$ are the amounts of ‘eat-me’ and ‘don’t-eat-me’ signals in the cell membrane at the birth of the RBC, respectively.

    From assumption A3, it follows that the conditions $E(t_s) - D(t_s) = T_s$ and $D(t_i) = T_i$ define the times $t_s$ and $t_i$ at which the RBC is removed through the silent and immune phagocytosis pathways, respectively. Introducing the expressions for $E(t)$ and $D(t)$ given by equations (2.2) into these conditions, we get the following values for $t_i$ and $t_s$:

    $$t_i = \frac{1}{\alpha}\ln\!\left(\frac{D_0}{T_i}\right) \qquad\text{and}\qquad t_s = \frac{1}{\beta}(T_s - E_0) + \frac{1}{\alpha}\, W\!\left(\frac{\alpha}{\beta}\, D_0\, e^{\alpha(E_0 - T_s)/\beta}\right), \tag{2.3}$$

    where W(⋅) is the product logarithm or Lambert W function.
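    The expression for $t_s$ involves one step the text leaves implicit: substituting equations (2.2) into $E(t_s) - D(t_s) = T_s$ gives $E_0 + \beta t_s - D_0 e^{-\alpha t_s} = T_s$. Writing $u = t_s - (T_s - E_0)/\beta$ turns this into $\alpha u\, e^{\alpha u} = (\alpha/\beta) D_0\, e^{\alpha(E_0 - T_s)/\beta}$, whose solution is $\alpha u = W\!\big((\alpha/\beta) D_0\, e^{\alpha(E_0 - T_s)/\beta}\big)$; adding back $(T_s - E_0)/\beta$ recovers equations (2.3).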

    Equations (2.3) define the conditions that dictate the fate of the RBC. The cell is cleared through the silent pathway if $t_s < t_i$ and through the immune pathway otherwise. From equations (2.3), it follows that the timing of RBC phagocytosis, and hence its lifespan, is given by $L = \min(t_i, t_s)$.
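    Equations (2.3) are straightforward to evaluate numerically. The sketch below is our illustration, not code from the authors: it assumes SciPy’s `lambertw`, and the parameter values are purely hypothetical, chosen only so that the silent threshold is reached at roughly 120 days.

```python
import numpy as np
from scipy.special import lambertw

def rbc_fate(alpha, beta, D0, E0, Ts, Ti):
    """Evaluate equations (2.3): the removal time through each pathway,
    the lifespan L = min(t_i, t_s) and the pathway that fires first.
    Parameter names mirror the text; all values used here are illustrative."""
    # Immune pathway: D(t_i) = Ti  =>  t_i = (1/alpha) * ln(D0 / Ti)
    t_i = np.log(D0 / Ti) / alpha
    # Silent pathway: E(t_s) - D(t_s) = Ts, solved via the Lambert W function
    w = lambertw((alpha / beta) * D0 * np.exp(alpha * (E0 - Ts) / beta)).real
    t_s = (Ts - E0) / beta + w / alpha
    pathway = 'silent' if t_s < t_i else 'immune'
    return t_s, t_i, min(t_s, t_i), pathway

# Hypothetical parameters (no fitted values are published):
t_s, t_i, lifespan, pathway = rbc_fate(alpha=0.02, beta=1.0, D0=100.0,
                                       E0=0.0, Ts=111.0, Ti=5.0)
print(f'{pathway} pathway, lifespan = {lifespan:.1f} days')
# -> silent pathway, lifespan = 120.1 days
```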

    From equations (2.1) to (2.3), ‘eat-me’ and ‘don’t-eat-me’ signals can be viewed as defining a cellular algorithm whose execution in the membrane of each RBC determines both its fate (i.e. whether it is removed through the silent or the immune pathway) and its lifespan. In this section, we will show that this algorithm provides a coherent, integrative view of RBC homeostasis. According to equations (2.3), RBC fate and lifespan are unambiguously defined by the specific values of six parameters. Roughly speaking, these parameters represent the amount of membrane signals at the birth of the cell ($D_0$ and $E_0$), the rates of change of these signals (α and β), and the thresholds that elicit the phagocytosis of RBCs ($T_s$ and $T_i$). Variations in any of these features result in changes in either the lifespan of the cell or the phagocytosis pathway leading to its destruction (figure 2). Bearing this fact in mind, we will next enumerate a series of biological mechanisms that could be used by the organism to modulate RBC lifespan, and discuss their consequences for RBC homeostasis.


    Figure 2. Results of the mathematical model of RBC lifespan determination. (a) The dynamics of membrane signals as defined by equations (2.1)–(2.3) satisfy the qualitative constraints imposed by empirical evidence (E1–E4). Both the lifespan of the cell and the pathway through which it is phagocytized (i.e. silent or immune) depend on the particular values of the model parameters. In this case, the difference between ‘eat-me’ and ‘don’t-eat-me’ signals is the first to reach its critical threshold ($T_s$), so this cell is destroyed through the silent pathway at time $t_s$ (which sets its lifespan). (b) Lowering the silent threshold (parameter $T_s$ in the model) shortens the lifespan of the cell but does not change the phagocytosis pathway. (c) By contrast, lower CD47 expression at the birth of the cell (parameter $D_0$) both shortens the lifespan of the cell and changes the condition that triggers its phagocytosis (from silent to immune).


    As we noted above, OS causes the accumulation of defects in the cytosol and membrane of RBCs, increasing the probability of malfunction and even of cell lysis in the blood. In extreme cases, this may induce a severe clinical condition known as haemolysis [68]. RBCs showing signs of oxidative damage should therefore be removed from the circulation in order to minimize the risk of haemolysis. It has been suggested that the level of PS expression is one of those signs, since higher levels of OS are accompanied by higher rates of PS externalization [34,35,69,70].

    PS exposure in response to OS is not a passive process. Instead, it seems to be mediated by cytoplasmic RBC proteins [27], suggesting that RBCs are able to accelerate the rate of PS externalization in case of oxidative damage. From this observation, we can deduce that higher values of parameter β (rate of PS externalization) correspond to RBCs exposed to higher levels of OS (see equations (2.1)). In agreement with empirical observations, this condition shortens RBC lifespan [20,70] (figure 3). From the perspective of our model, accelerated PS exposure in response to OS can be interpreted as an active mechanism to minimize the risk of RBC lysis in the blood. By increasing the rate of PS translocation, an RBC would hasten its phagocytosis through the silent pathway. The cell would therefore be removed from the circulation before attaining a critical level of oxidative damage that might compromise its function or even its physical viability.
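    Reusing the `rbc_fate` sketch above (same hypothetical parameters), the qualitative behaviour of figure 3a can be reproduced: a higher β shortens the predicted lifespan, while below a critical β the immune pathway fires first.

```python
# Higher beta = faster PS externalization under stronger oxidative stress.
# Lifespan falls as beta grows; below a critical beta the immune threshold
# is reached first (cf. the beta* of figure 3a).
for beta in (0.5, 0.8, 1.0, 1.5, 2.0):
    *_, lifespan, pathway = rbc_fate(alpha=0.02, beta=beta, D0=100.0,
                                     E0=0.0, Ts=111.0, Ti=5.0)
    print(f'beta = {beta:.1f}: lifespan = {lifespan:6.1f} days ({pathway})')
```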


    Figure 3. Potential mechanisms of RBC lifespan modulation. (a) Higher levels of oxidative stress are associated with higher rates of PS externalization. In agreement with empirical data, the model predicts an inverse correlation between the degree of OS and RBC lifespan. If the rate of PS externalization is above a critical value ($\beta^\ast$), the curve of $t_s$ (time to reach the silent threshold) lies below the curve of $t_i$ (time to attain the immune threshold). This implies that for high values of OS ($\beta > \beta^\ast$) RBCs are phagocytized through the silent pathway. Only if $\beta < \beta^\ast$ are RBCs destroyed through the immune pathway, which can lead to anti-RBC autoimmunity. (b) Time evolution of the difference between ‘eat-me’ and ‘don’t-eat-me’ signals in the membrane of an RBC formed at time $t_1$. The difference between membrane signals should reach the silent threshold at time $t_2$, thereby causing the phagocytosis of the cell. Increasing the silent threshold delays the phagocytosis of the RBC until $t_3$, thus extending its lifespan. (c) Neocytolysis. The figure shows the dynamics of the difference between ‘eat-me’ and ‘don’t-eat-me’ signals in the membranes of two RBCs that differ in the expression of membrane signals at birth. The first cell, formed at time $t_1$, is phagocytized at time $t_4$ after a normal lifespan. The second cell, born at time $t_2 > t_1$ with a larger difference between ‘eat-me’ and ‘don’t-eat-me’ signals in its membrane, attains the silent threshold much faster, so it is destroyed at time $t_3$, before the first cell and after a much shorter lifespan. (d) According to our model, the lifespan of each RBC is directly correlated with the level of CD47 expressed in its membrane when it is formed. Low values of CD47 expression could explain the short lifespans observed during neocytolysis. Furthermore, if the initial amount of CD47 falls below a critical level (the autoimmunity threshold), immune phagocytosis occurs before the silent pathway ($t_i < t_s$). In this case, macrophages of the MPS phagocytize RBCs after very short lifespans and initiate anti-RBC autoimmune responses.


    According to equations (2.3), tuning the silent phagocytosis threshold provides another mechanism to modulate RBC lifespan. This threshold is defined as the difference between ‘eat-me’ and ‘don’t-eat-me’ signals that triggers the silent phagocytosis pathway in macrophages of the MPS. Hence, from a mechanistic point of view, tuning this parameter amounts to modulating the sensitivity of macrophages to RBC signals. Increasing the silent phagocytosis threshold delays phagocytosis and extends RBC lifespan (equations (2.3) and figure 3b). Each day added to mean RBC lifespan prevents the destruction of $10^{11}$ cells (around 1% of the total population), which is equivalent to the daily production of RBCs in normal conditions.

    A significant fall in the number of RBCs after a haemorrhage may produce a deficit of oxygen in the tissues. The subsequent rise in the levels of Epo in the blood [71] eventually restores the population of RBCs and the equilibrium of oxygen. However, given that this process involves the differentiation of precursor cells, it can take a few days to return the population to its original size. Increasing the silent threshold could buffer cell loss and help to maintain the supply of oxygen until Epo-mediated recovery of the population is completed. In this regard, we remark that empirical evidence suggests that phagocytosis by macrophages of the MPS is indeed suppressed after haemorrhages [72]. Furthermore, macrophages are equipped with Epo receptors [73], implying that the tuning of the silent phagocytosis threshold might be directly controlled by the levels of plasma Epo. Once the equilibrium of oxygen is recovered, the levels of Epo would return to normal values, restoring both the silent threshold and RBC lifespan.
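    In terms of the earlier sketch, raising $T_s$ plays the role of this hypothesized Epo-mediated suppression of phagocytosis. With the same hypothetical parameters, the predicted lifespan grows with $T_s$ until the immune threshold becomes the binding constraint:

```python
# Raising Ts (macrophages less sensitive to the signal imbalance, as
# hypothesized after a haemorrhage) extends the predicted lifespan until
# the immune-pathway time t_i caps it.
for Ts in (100.0, 111.0, 120.0, 140.0, 150.0):
    *_, lifespan, pathway = rbc_fate(alpha=0.02, beta=1.0, D0=100.0,
                                     E0=0.0, Ts=Ts, Ti=5.0)
    print(f'Ts = {Ts:5.1f}: lifespan = {lifespan:6.1f} days ({pathway})')
```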

    Neocytolysis and erythrophagocytosis are currently considered as alternative mechanisms of RBC removal [3,13]. In particular, it is implicitly assumed that erythrophagocytosis is the default pathway of destruction of senescent RBCs during normal homeostasis, while neocytolysis is somehow triggered by decreased levels of Epo [9,11]. Such drops of Epo occur, in particular, whenever oxygen availability in the tissues is above physiological needs. For instance, people descending to sea level after a period of acclimation to high altitudes move from lower to higher partial pressures of atmospheric oxygen. In this situation, the population of RBCs is larger than needed to ensure the supply of oxygen to the tissues, and contracts to a new equilibrium size through the selective destruction of younger RBCs [74]. The mechanisms underlying the switch from erythrophagocytosis to neocytolysis remain poorly understood [9,70].

    In this work, we suggest that neocytolysis and erythrophagocytosis should not be considered as independent mechanisms, but as alternative outcomes of the algorithm of RBC lifespan determination. Specifically, both processes can be explained as caused by different patterns of PS and CD47 expression in the membrane of newly formed RBCs. Figure 3c compares the lifespans of two RBCs that differ in the amount of membrane signals at birth. The RBC with the larger difference between PS and CD47 expression is the first to reach the silent phagocytosis threshold, even though it is born later. Moreover, this cell is destroyed after a short lifespan, while the other is spared and will only be removed after reaching the usual RBC lifespan. These features are precisely what defines neocytolysis. Therefore, according to our model, neocytolysis occurs if RBCs formed under lower levels of Epo are born with more PS or less CD47 in their outer membrane. Empirical evidence points to the latter, since young RBCs show lower levels of CD47 and similar levels of PS (when compared with older cells) in people descending to sea level after acclimation to high altitude [12].

    This result suggests that the transition from erythrophagocytosis to neocytolysis does not require a switch between alternative mechanisms of RBC destruction. Instead, the lifespan of RBCs can vary in a continuum that ranges from 10 days during neocytolysis to 80 days in newborns and 120 days in adult humans, depending on the levels of PS and/or CD47 at the birth of the cells. In order to illustrate the main point of this work, and for the sake of simplicity, we will continue our discussion assuming that Epo only affects CD47 expression in newly formed RBCs (figure 3d). Similar arguments could be drawn if Epo also determined initial PS levels.
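    Again with the hypothetical parameters of the earlier sketch, lowering $D_0$ both shortens the predicted lifespan and, below a critical value, switches the fate from silent to immune phagocytosis (cf. figure 3d):

```python
# Smaller D0 = less CD47 at birth, as reported for RBCs formed under low
# Epo. The predicted lifespan shortens continuously; below a critical D0
# the immune pathway fires first (the model's autoimmunity threshold).
for D0 in (100.0, 60.0, 30.0, 10.0, 6.0):
    *_, lifespan, pathway = rbc_fate(alpha=0.02, beta=1.0, D0=D0,
                                     E0=0.0, Ts=111.0, Ti=5.0)
    print(f'D0 = {D0:5.1f}: lifespan = {lifespan:6.1f} days ({pathway})')
```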

    Neocytolysis reduces the number of RBCs when oxygen supply exceeds the demands of body tissues [9,11]. We suggest that autoimmunity could provide a complementary mechanism to accelerate the contraction of the RBC population in such circumstances. According to the model, anti-RBC autoimmune responses emerge from the same process that leads to neocytolysis, namely the reduction in CD47 expression in newly formed RBCs. If the population of RBCs is still larger than required after neocytolysis-driven contraction, the levels of Epo continue to drop. In consequence, newly formed RBCs express progressively less CD47 in their membranes (figure 3d). RBCs whose initial level of CD47 expression falls below a critical point (labelled the autoimmunity threshold) are phagocytized through the immune pathway (figure 3d). The ensuing production of natural auto-antibodies would foster the death of other RBCs, further contracting the population.

    The view of anti-RBC autoimmune responses as a homeostatic mechanism is supported by the fact that natural antibodies do not target all circulating RBCs, which would otherwise result in a massive and uncontrolled loss of cells. Instead, they are directed against specific epitopes usually expressed in aged RBCs and absent in young cells [57]. Moreover, RBCs of any age are also protected from the action of auto-antibodies by CD47, which is known to inhibit the phagocytosis of opsonized cells [75,28]. On the other hand, the protection provided by CD47 is dose-dependent [75], implying that the destruction of an individual RBC through this antibody-mediated pathway depends on both its level of CD47 and the concentration of antibodies present in the blood. For this reason, only those RBCs with high CD47 expression survive the more aggressive responses. Therefore, the intensity of the autoimmune response (i.e. the amount of auto-antibodies produced) determines the cohorts of RBCs that are destroyed, and hence the extent of the reduction in the number of cells.

    In normal conditions, autoimmunity-driven contraction of the population should eventually restore physiological levels of oxygen. Under the assumptions of our model, the subsequent rise in Epo would increase CD47 expression in new RBCs, arresting the production of anti-RBC antibodies (figure 3d). Further increases of initial CD47 would also interrupt neocytolysis and restore RBC lifespan to normal values observed in erythrophagocytosis. Therefore, Epo-dependent regulation of CD47 in new RBCs creates a switch between silent and immune phagocytosis and makes both neocytolysis and homeostatic autoimmunity reversible processes.

    Our model also suggests that anti-RBC responses are only triggered if levels of OS are sufficiently low (figure 3a). Assuming the homeostatic nature of autoimmunity, this result can be understood as preventing the production of auto-antibodies in conditions of severe OS. Under these circumstances, oxidative damage can cause the abnormal destruction of many RBCs, making the need for anti-RBC antibodies to remove an excess of cells unlikely.

    The role of Epo in RBC production and its relationship to oxygen homeostasis are well established in the literature [3]. It has been hypothesized that Epo could also control the onset of neocytolysis by modulating the interaction between macrophages of the MPS and young circulating RBCs [11]. The theoretical model presented in this work supports this hypothesis by suggesting an explicit mechanism that links Epo to RBC lifespan determination. Moreover, this model suggests that neocytolysis can be understood as a particular manifestation of a more general function of Epo as a determinant of RBC destruction. This function would consist of setting RBC lifespan by adjusting the phagocytosis thresholds and the levels of CD47 expression in newly formed cells. If proven correct, this model would explain a variety of RBC responses to changes in oxygen supply to the tissues.

    For instance, people descending to sea level after high-altitude acclimation show sharp fluctuations of Epo owing to altitude-related changes in the partial pressure of oxygen. Epo increases during acclimation to higher altitudes, and falls after returning to sea levels, attaining lower values than those found before altitude acclimation [12,70] (figure 4a). Similar Epo dynamics have been described in malaria patients. The destruction of RBCs by Plasmodium parasites during the first stages of malaria causes a deficit of oxygen in the tissues, and a subsequent increase of Epo [76,77]. By contrast, later stages of the infection are usually associated with insufficient production of Epo [77–79]. Epo also falls if partial pressure of oxygen increases, e.g. during spaceflights or in the return to sea level after high-altitude acclimation [80,9].


    Figure 4. A theoretical model for the relationship between RBC lifespan and oxygen homeostasis. (a) Acclimation to environments with different partial pressures of oxygen, or clinical conditions that involve massive RBC loss such as malaria, entail sharp fluctuations in the levels of plasma Epo (see text for references). (b) We hypothesize that Epo controls CD47 expression in newly formed RBCs, which in turn sets their expected lifespan (see equations (2.3)). In normal conditions both Epo and oxygen levels are at equilibrium, and mean RBC lifespan is around 120 days (0). Any variation in Epo, independently of its cause, changes the amount of CD47 in newly formed RBCs and hence their lifespan. From this perspective, a pronounced decrease in Epo suffices to account for the onset of neocytolysis observed in people returning to sea level after high-altitude acclimation or in malaria patients (labelled as −1 in the figure). Further drops of Epo can lead to autoimmunity (labelled as −2), which could explain the presence of auto-antibodies against host RBCs in malaria patients or in astronauts after space flights. See the text for further details.


    We postulate that any drop in Epo exerts similar effects on RBC lifespan, independently of its cause. Within the framework of our model, these effects range from neocytolysis to the initiation of homeostatic autoimmunity (figure 4b). As a matter of fact, both neocytolysis and strong anti-RBC responses have been described in astronauts after space flights [9,74]. As for malaria infections, both Plasmodium falciparum and P. vivax infections cause the abnormal removal of a substantial number of non-parasitized cells (npRBCs) [81,82]. In some patients with severe malaria-derived anaemia, the destruction of npRBCs can even continue long after the infection has been cleared (see [83] and references therein). Therefore, malarial anaemia cannot be explained by the direct destruction of infected RBCs alone. Both the selective death of young non-parasitized RBCs [84–86] and the presence of anti-RBC antibodies [87] suggest that neocytolysis and homeostatic autoimmunity might play a major role in the development of anaemia during malaria. In this situation, the anomalous drop of Epo that characterizes the later stages of malaria infection would be erroneously perceived by the organism as caused by an excess of circulating RBCs. Normal homeostatic mechanisms (neocytolysis and homeostatic autoimmunity), triggered in abnormal conditions of oxygen availability, would then lead to anomalous reductions in the population of RBCs.

    The consumption of oxygen by the organism is highly variable owing to factors such as circadian metabolic rhythms, the intensity of physical activity or even fluctuations in ambient temperature [88,89]. In consequence, homeostatic mechanisms must continuously adjust the balance between RBC production and destruction to maintain an appropriate number of RBCs. The control of RBC production by Epo is well described in the literature [3]. By contrast, many questions about RBC destruction remain largely unanswered. In particular, no universally accepted explanation of the mechanisms underlying changes in RBC lifespan is available as yet.

    A substantial body of evidence points to PS and CD47 as key determinants of RBC phagocytosis [26–31]. In this work, we postulate that quantitative aspects of their dynamics explain how RBC lifespan variations are related to oxygen homeostasis. This statement is based on two main assumptions. First, that the pattern of PS and CD47 expression changes during the life of the cell, as evidenced by differences between young and aged RBCs [34–38]. Second, that the conditions that trigger RBC phagocytosis as described in the literature (see points E3 and E4 above) differ in the subsequent behaviour they elicit in macrophages of the MPS. Specifically, we postulate that the phagocytosis of RBCs with very low levels of CD47 provokes immune responses against host RBCs.

    The nature of this work is necessarily speculative owing to the lack of published data about the actual dynamics of CD47 and PS in the membrane of RBCs. We have modelled plausible dynamics that satisfy the constraints imposed by available evidence. The exponential decay proposed for CD47 has actually been described for other molecules present in RBCs [66,67]. As for the increase of PS externalization observed in ageing cells, we have assumed that it occurs at a constant rate for the sake of simplicity. A different mathematical formalization of the model would involve a different set of parameters, suggesting perhaps other mechanisms of RBC lifespan modulation. In any case, the conceptual model that emerges from published evidence (outlined in figure 1) is independent of any particular mathematical formulation. According to this conceptual model, PS and CD47 constitute a molecular clock that sets the timing of RBC phagocytosis. RBC lifespan would then be determined by the time it takes for these signals to satisfy one of the two conditions that trigger the phagocytosis of the cell.

    A mathematical version of this conceptual model suggests several mechanisms that might modulate RBC lifespan. First, changes in CD47 expression in newly formed RBCs could account for the differences in lifespan observed in erythrophagocytosis and neocytolysis, as well as for the origin and function of anti-RBC autoimmunity. We remark that none of these processes is explicitly implemented in the equations of the model. Instead, they emerge as alternative outcomes of the same algorithm of lifespan determination for different values of initial CD47 expression at the birth of the cell. Second, by controlling macrophage phagocytic activity, Epo levels might continuously adjust the value of the phagocytosis thresholds, thus fine-tuning the lifespan of circulating RBCs. Finally, higher levels of OS might shorten RBC lifespan by accelerating the rate of PS exposure in the outer membrane of the cell. These mechanisms are independent and might act simultaneously to determine RBC lifespan. In this respect, it has recently been suggested that hypoxia-inducible factors (HIFs) might be involved in shortening RBC lifespan during neocytolysis [70]. The effect of HIFs would be related to lower catalase activity in young RBCs formed in hypoxia. Under this assumption, such young RBCs would be more susceptible to OS in case of a rise in oxygen availability, which would translate into higher rates of PS externalization. From the perspective of our model, this would imply that parameter β takes higher values in RBCs formed during hypoxia. At the same time, the amount of CD47 in new RBCs could be modulated by Epo depending on the levels of oxygen. The combined effects of accelerated PS exposure and lower CD47 expression would result in shortened RBC lifespan and, eventually, in the production of anti-RBC auto-antibodies. In turn, autoimmunity and neocytolysis would rapidly contract the population whenever oxygen supply is above physiological needs.

    Further research is needed to unveil all the mechanisms underlying RBC lifespan determination. However, irrespective of their ultimate causes, variations in RBC lifespan play a central role in the ability of the organism to modulate the rate of RBC destruction. Specifically, if all human RBCs lived 120 days, then the temporal pattern of cell destruction would simply reproduce the pattern of formation of new RBCs with a delay of 120 days. Extending mean lifespan beyond 120 days lowers the rate of cell destruction and enlarges the number of RBCs in the blood. Conversely, the phagocytosis of RBCs under 120 days of age contracts the population by increasing the rate of cell destruction. Therefore, any theory intending to explain RBC homeostasis should explicitly address the question of how RBC lifespan is determined. The conceptual model introduced in this work constitutes a first step towards the development of such a theory. We believe that this model will improve our understanding of how RBC homeostasis is maintained in normal circumstances and how its imbalance can lead to pathology.
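    The delay argument above can be stated compactly (our notation, not the authors’): if every RBC lived exactly $L$ days and $p(t)$ denotes the rate of production of new RBCs, the population size $N(t)$ would obey

$$\frac{\mathrm{d}N}{\mathrm{d}t} = p(t) - p(t - L),$$

so destruction would merely replay production with a delay $L$. Only by modulating $L$ itself, or by removing cells early, can the organism decouple the rate of destruction from past production.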

    Both authors conceived this work, collaborated in finding and reviewing available literature on this field, contributed equally to the development of the theoretical model presented in this work, collaborated in writing the manuscript and gave final approval for publication.

    The authors declare that they have no competing interests.

    The authors have not received any particular financial support for this work.

    The authors are grateful to F. J. Acosta for helpful comments on the manuscript.

    Footnotes

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Wynn TA, Chawla A, Pollard JW. 2013 Macrophage biology in development, homeostasis and disease. Nature 496, 445–455. (doi:10.1038/nature12034)
    2. Lang E, Qadri SM, Lang F. 2012 Killing me softly—suicidal erythrocyte death. Int. J. Biochem. Cell Biol. 44, 1236–1243. (doi:10.1016/j.biocel.2012.04.019)
    3. Trial J, Rice L. 2004 Erythropoietin withdrawal leads to the destruction of young red cells at the endothelial-macrophage interface. Curr. Pharmaceut. Des. 10, 183–190. (doi:10.2174/1381612043453423)
    4. Goodman JW, Smith LH. 1961 Erythrocyte life span in normal mice and in radiation bone marrow chimeras. Am. J. Physiol. 200, 764–770.
    5. Horký J, Vácha J, Znojil V. 1977 Comparison of life span of erythrocytes in some inbred strains of mouse using 14C-labelled glycine. Physiol. Bohemoslovaca 27, 209–217.
    6. Piomelli S, Seaman C. 1993 Mechanism of red blood cell aging: relationship of cell density and cell age. Am. J. Hematol. 42, 46–52. (doi:10.1002/ajh.2830420110)
    7. Jelkmann W. 2011 Regulation of erythropoietin production. J. Physiol. 589, 1251–1258. (doi:10.1113/jphysiol.2010.195057)
    8. Berglund B. 2012 High-altitude training. Sports Med. 14, 289–303. (doi:10.2165/00007256-199214050-00002)
    9. Risso A, Ciana A, Achilli C, Antonutto G, Minetti G. 2007 Neocytolysis: none, one or many? A reappraisal and future perspectives. In Regulation of red cell life-span, erythropoiesis, senescence and clearance (eds A Bogdanova, L Kaestner), p. 90. Lausanne, Switzerland: Frontiers E-books.
    10. Gunga H-C, Weller von Ahlefeld V, Appell Coriolano H-J, Werner A, Hoffmann U. 2016 Red blood cells in space. In Cardiovascular system, red blood cells, and oxygen transport in microgravity, pp. 35–55. Cham, Switzerland: Springer International Publishing.
    11. Rice L, Alfrey CP. 2005 The negative regulation of red cell mass by neocytolysis: physiologic and pathophysiologic manifestations. Cell. Physiol. Biochem. 15, 245–250. (doi:10.1159/000087234)
    12. Risso A, Turello M, Biffoni F, Antonutto G. 2007 Red blood cell senescence and neocytolysis in humans after high altitude acclimatization. Blood Cells Mol. Dis. 38, 83–92. (doi:10.1016/j.bcmd.2006.10.161)
    13. Trial J, Rice L, Alfrey CP. 2001 Erythropoietin withdrawal alters interactions between young red blood cells, splenic endothelial cells, and macrophages. J. Invest. Med. 49, 335–345. (doi:10.2310/6650.2001.33899)
    14. Divoky V, Song J, Horvathova M, Kralova B, Votavova H, Prchal JT, Yoon D. 2016 Delayed hemoglobin switching and perinatal neocytolysis in mice with gain-of-function erythropoietin receptor. J. Mol. Med. 94, 597–608. (doi:10.1007/s00109-015-1375-y)
    15. Pearson HA. 1967 Life-span of the fetal red blood cell. J. Pediatr. 70, 166–171. (doi:10.1016/S0022-3476(67)80410-4)
    16. Hattangadi SM, Lodish HF. 2007 Regulation of erythrocyte lifespan: do reactive oxygen species set the clock? J. Clin. Invest. 117, 2075–2077. (doi:10.1172/JCI32559)
    17. Rifkind JM, Nagababu E. 2013 Hemoglobin redox reactions and red blood cell aging. Antioxid. Redox Signal. 18, 2274–2283. (doi:10.1089/ars.2012.4867)
    18. Mohanty J, Nagababu E, Rifkind J. 2014 Red blood cell oxidative stress impairs oxygen delivery and induces red blood cell aging. Front. Physiol. 5, 84. (doi:10.3389/fphys.2014.00084)
    19. Edwards CJ, Fuller J. 1996 Oxidative stress in erythrocytes. Comp. Haematol. Int. 6, 24–31. (doi:10.1007/BF00368098)
    20. Marinkovic D, Zhang X, Yalcin S, Luciano JP, Brugnara C, Huber T, Ghaffari S. 2007 Foxo3 is required for the regulation of oxidative stress in erythropoiesis. J. Clin. Invest. 117, 2133–2144. (doi:10.1172/JCI31807)
    21. de Back D, Kostova E, van Kraaij M, van den Berg T, van Bruggen R. 2014 Of macrophages and red blood cells; a complex love story. Front. Physiol. 5, 9. (doi:10.3389/fphys.2014.00009)
    22. Ganz T. 2012 Macrophages and systemic iron homeostasis. J. Innate Immun. 4, 446–453. (doi:10.1159/000336423)
    23. Lang F, Lang E, Föller M. 2012 Physiology and pathophysiology of eryptosis. Transf. Med. Hemother. 39, 308–314. (doi:10.1159/000342534)
    24. Kempe DS, Lang PA, Duranton C, Akel A, Lang KS, Huber SM, Wieder T, Lang F. 2006 Enhanced programmed cell death of iron-deficient erythrocytes. FASEB J. 20, 368–370. (doi:10.1096/fj.05-4872fje)
    25. Lang F, Lang KS, Lang PA, Huber SM, Wieder T. 2006 Mechanisms and significance of eryptosis. Antioxid. Redox Signal. 8, 1183–1192. (doi:10.1089/ars.2006.8.1183)
    26. Bosman GJCGM, Lasonder E, Groenen-Döpp YAM, Willekens FLA, Werre JM, Novotný VMJ. 2010 Comparative proteomics of erythrocyte aging in vivo and in vitro. J. Proteom. 73, 396–402. (doi:10.1016/j.jprot.2009.07.010)
    27. Freikman I, Fibach E. 2011 Distribution and shedding of the membrane phosphatidylserine during maturation and aging of erythroid cells. Biochim. Biophys. Acta 1808, 2773–2780. (doi:10.1016/j.bbamem.2011.08.014)
    28. Oldenborg P-A, Gresham HD, Chen Y, Izui S, Lindberg FP. 2002 Lethal autoimmune hemolytic anemia in CD47-deficient nonobese diabetic (NOD) mice. Blood 99, 3500–3504. (doi:10.1182/blood.V99.10.3500)
    29. van den Berg TK, van der Schoot CE. 2008 Innate immune ‘self’ recognition: a role for CD47–SIRPα interactions in hematopoietic stem cell transplantation. Trends Immunol. 29, 203–206. (doi:10.1016/j.it.2008.02.004)
    30. Olsson M, Nilsson A, Oldenborg P-A. 2006 Target cell CD47 regulates macrophage activation and erythrophagocytosis. Transf. Clin. Biol. 13, 39–43. (doi:10.1016/j.tracli.2006.02.013)
    31. Tsai RK, Rodriguez PL, Discher DE. 2010 Self inhibition of phagocytosis: the affinity of ‘marker of self’ CD47 for SIRPα dictates potency of inhibition but only at low expression levels. Blood Cells Mol. Dis. 45, 67–74. (doi:10.1016/j.bcmd.2010.02.016)
    32. Oldenborg P-A, Zheleznyak A, Fang Y-F, Lagenaur CF, Gresham HD, Lindberg FP. 2000 Role of CD47 as a marker of self on red blood cells. Science 288, 2051–2054. (doi:10.1126/science.288.5473.2051)
    33. Ishikawa-Sekigami T et al. 2006 SHPS-1 promotes the survival of circulating erythrocytes through inhibition of phagocytosis by splenic macrophages. Blood 107, 341–348. (doi:10.1182/blood-2005-05-1896)
    34. Gottlieb Y et al. 2012 Physiologically aged red blood cells undergo erythrophagocytosis in vivo but not in vitro. Haematologica 97, 994–1002. (doi:10.3324/haematol.2011.057620)
    35. Kiefer CR, Michael Snyder L. 2000 Oxidation and erythrocyte senescence. Curr. Opin. Hematol. 7, 113–116. (doi:10.1097/00062752-200003000-00007)
    36. Freikman I, Amer J, Cohen JS, Ringel I, Fibach E. 2008 Oxidative stress causes membrane phospholipid rearrangement and shedding from RBC membranes—an NMR study. Biochim. Biophys. Acta 1778, 2388–2394. (doi:10.1016/j.bbamem.2008.06.008)
    37. Khandelwal S, Van Rooijen N, Saxena RK. 2007 Reduced expression of CD47 during murine red blood cell (RBC) senescence and its role in RBC clearance from the circulation. Transfusion 47, 1725–1732. (doi:10.1111/j.1537-2995.2007.01348.x)
    38. Liu J, Guo X, Mohandas N, Chasis JA, An X. 2010 Membrane remodeling during reticulocyte maturation. Blood 115, 2021–2027. (doi:10.1182/blood-2009-08-241182)
    39. Burger P, Hilarius-Stokman P, de Korte D, van den Berg TK, van Bruggen R. 2012 CD47 functions as a molecular switch for erythrocyte phagocytosis. Blood 119, 5512–5521. (doi:10.1182/blood-2011-10-386805)
    40. Brown GC, Neher JJ. 2012 Eaten alive! Cell death by primary phagocytosis: ‘phagoptosis’. Trends Biochem. Sci. 37, 325–332. (doi:10.1016/j.tibs.2012.05.002)
    41. Droin N, Cathelin S, Jacquel A, Guéry L, Garrido C, Fontenay M, Hermine O, Solary E. 2008 A role for caspases in the differentiation of erythroid cells and macrophages. Biochimie 90, 416–422. (doi:10.1016/j.biochi.2007.08.007)
    42. Matozaki T, Murata Y, Okazawa H, Ohnishi H. 2009 Functions and molecular mechanisms of the CD47–SIRPα signalling pathway. Trends Cell Biol. 19, 72–80. (doi:10.1016/j.tcb.2008.12.001)
    43. Oldenborg P-A. 2013 CD47: a cell surface glycoprotein which regulates multiple functions of hematopoietic cells in health and disease. ISRN Hematol. 2013, 614619. (doi:10.1155/2013/614619)
    44. Gregory CD, Devitt A. 2004 The macrophage and the apoptotic cell: an innate immune interaction viewed simplistically? Immunology 113, 1–14. (doi:10.1111/j.1365-2567.2004.01959.x)
    45. Latour S et al. 2001 Bidirectional negative regulation of human T and dendritic cells by CD47 and its cognate receptor signal-regulator protein-α: down-regulation of IL-12 responsiveness and inhibition of dendritic cell activation. J. Immunol. 167, 2547–2554. (doi:10.4049/jimmunol.167.5.2547)
    46. Ravichandran KS, Lorenz U. 2007 Engulfment of apoptotic cells: signals for a good meal. Nat. Rev. Immunol. 7, 964–974. (doi:10.1038/nri2214)
    47. Murata Y, Kotani T, Ohnishi H, Matozaki T. 2014 The CD47–SIRPα signalling system: its physiological roles and therapeutic application. J. Biochem. 155, 335–344. (doi:10.1093/jb/mvu017)
    48. Frey B, Gaipl US. 2011 The immune functions of phosphatidylserine in membranes of dying cells and microvesicles. Semin. Immunopathol. 33, 497–516. Berlin, Germany: Springer.
    49. Pittoni V, Valesini G. 2002 The clearance of apoptotic cells: implications for autoimmunity. Autoimmun. Rev. 1, 154–161. (doi:10.1016/S1568-9972(02)00032-0)
    50. Peiser L, Mukhopadhyay S, Gordon S. 2002 Scavenger receptors in innate immunity. Curr. Opin. Immunol. 14, 123–128. (doi:10.1016/S0952-7915(01)00307-7)
    51. Sarfati M, Fortin G, Raymond M, Susin S. 2008 CD47 in the immune response: role of thrombospondin and SIRP-α reverse signaling. Curr. Drug Targets 9, 842–850. (doi:10.2174/138945008785909310)
    52. Sosale NG, Spinler KR, Alvey C, Discher DE. 2015 Macrophage engulfment of a cell or nanoparticle is regulated by unavoidable opsonization, a species-specific ‘Marker of Self’ CD47, and target physical properties. Curr. Opin. Immunol. 35, 107–112. (doi:10.1016/j.coi.2015.06.013)
    53. Oshima K, Ruhul Amin ARM, Suzuki A, Hamaguchi M, Matsuda S. 2002 SHPS-1, a multifunctional transmembrane glycoprotein. FEBS Lett. 519, 1–7. (doi:10.1016/S0014-5793(02)02703-5)
    54. Fadok VA, Bratton DL, Henson PM. 2001 Phagocyte receptors for apoptotic cells: recognition, uptake, and consequences. J. Clin. Invest. 108, 957–962. (doi:10.1172/JCI200114122)
    55. van Beek EM, Cochrane F, Neil Barclay A, van den Berg TK. 2005 Signal regulatory proteins in the immune system. J. Immunol. 175, 7781–7787. (doi:10.4049/jimmunol.175.12.7781)
    56. Lutz HU. 2012 Naturally occurring autoantibodies in mediating clearance of senescent red blood cells. In Naturally occurring antibodies (NAbs) (ed. HU Lutz), pp. 76–90. Berlin, Germany: Springer.
    57. Fossati-Jimack L, da Silveira SA, Moll T, Kina T, Kuypers FA, Oldenborg P-A, Reininger L, Izui S. 2002 Selective increase of autoimmune epitope expression on aged erythrocytes in mice: implications in anti-erythrocyte autoimmune responses. J. Autoimmun. 18, 17–25. (doi:10.1006/jaut.2001.0563)

    • 58

      Mauri C, Bosma A. 2012Immune regulatory function of B cells. Annu. Rev. Immunol. 30, 221–241. (doi:10.1146/annurev-immunol-020711-074934) Crossref, PubMed, ISI, Google Scholar

    • 59

      Boes M. 2000Role of natural and immune IgM antibodies in immune responses. Mol. Immunol. 37, 1141–1149. (doi:10.1016/S0161-5890(01)00025-6) Crossref, PubMed, ISI, Google Scholar

    • 60

      Panda S, Ding JL. 2015Natural antibodies bridge innate and adaptive immunity. J. Immunol. 194, 13–20. (doi:10.4049/jimmunol.1400844) Crossref, PubMed, ISI, Google Scholar

    • 61

      Vas J, Grönwall C, Silverman GJ. 2015Fundamental roles of the innate-like repertoire of natural antibodies in immune homeostasis. In The evolution and development of the antibody repertoire (ed. HW Schroeder Jr), pp. 34–41. Lausanne, Switzerland: Frontiers E-books. Google Scholar

    • 62

      Gehrs BC, Friedberg RC. 2002Autoimmune hemolytic anemia. Am. J. Hematol. 69, 258–271. (doi:10.1002/ajh.10062) Crossref, PubMed, ISI, Google Scholar

    • 63

      Hume DA. 2008Differentiation and heterogeneity in the mononuclear phagocyte system. Mucosal Immunol. 1, 432–441. (doi:10.1038/mi.2008.36) Crossref, PubMed, ISI, Google Scholar

    • 64

      Lohse AW, Knolle PA, Bilo K, Uhrig A, Waldmann C, Ibe M, Schmitt E, Gerken G, Meyer Zum Buschenfelde KH. 1996Antigen-presenting function and B7 expression of murine sinusoidal endothelial cells and Kupffer cells. Gastroenterology 110, 1175–1181. (doi:10.1053/gast.1996.v110.pm8613007) Crossref, PubMed, ISI, Google Scholar

    • 65

      Yi T, Li J, Chen H, Wu J, An J, Xu Y, Hu Y, Lowell CA, Cyster JG. 2015Splenic dendritic cells survey red blood cells for missing self-CD47 to trigger adaptive immune responses. Immunity 43, 764–775. (doi:10.1016/j.immuni.2015.08.021) Crossref, PubMed, ISI, Google Scholar

    • 66

      Khandelwal S, Saxena RK. 2006Assessment of survival of aging erythrocyte in circulation and attendant changes in size and CD147 expression by a novel two step biotinylation method. Exp. Gerontol. 41, 855–861. (doi:10.1016/j.exger.2006.06.045) Crossref, PubMed, ISI, Google Scholar

    • 67

      Lutz HU, Bogdanova A. 2007Mechanisms tagging senescent red blood cells for clearance in healthy humans. In Regulation of red cell life-span, erythropoiesis, senescence and clearance (eds A Bogdanova, L Kaestner), p. 45. Lausanne, Switzerland: Frontiers E-books. Google Scholar

    • 68

      Fibach E. 2014Involvement of oxidative stress in hemolytic anemia. In Systems biology of free radicals and antioxidants (ed. I Laher), pp. 2499–2516. Berlin, Germany: Springer. Google Scholar

    • 69

      Arese P, Turrini F, Schwarzer E. 2005Band 3/complement-mediated recognition and removal of normally senescent and pathological human erythrocytes. Cell. Physiol. Biochem. 16, 133–146. (doi:10.1159/000089839) Crossref, PubMed, Google Scholar

    • 70

      Song J, Yoon D, Christensen RD, Horvathova M, Thiagarajan P, Prchal JT. 2015HIF-mediated increased ROS from reduced mitophagy and decreased catalase causes neocytolysis. J. Mol. Med. 93, 857–866. (doi:10.1007/s00109-015-1294-y) Crossref, PubMed, ISI, Google Scholar

    • 71

      Ditting T, Hilgers KF, Stetter A, Linz P, Schönweiss C, Veelken R. 2007Renal sympathetic nerves modulate erythropoietin plasma levels after transient hemorrhage in rats. Am. J. Physiol. Renal Physiol. 293, F1099–F1106. (doi:10.1152/ajprenal.00267.2007) Crossref, PubMed, ISI, Google Scholar

    • 72

      Hsieh C-H, Nickel EA, Hsu J-T, Schwacha MG, Bland KI, Chaudry IH. 2009Trauma-hemorrhage and hypoxia differentially influence Kupffer cell phagocytic capacity: role of hypoxia-inducible-factor-α and phosphoinositide 3-kinase/Akt activation. Ann. Surgery 250, 995. (doi:10.1097/SLA.0b013e3181b0ebf8) Crossref, PubMed, ISI, Google Scholar

    • 73

      Lifshitz L, Tabak G, Gassmann M, Mittelman M, Neumann D. 2010Macrophages as novel target cells for erythropoietin. Haematologica 95, 1823–1831. (doi:10.3324/haematol.2010.025015) Crossref, PubMed, ISI, Google Scholar

    • 74

      Rizzo AM, Negroni M, Corsetto PA, Montorfano G, Milani S, Zava S, Tavella S, Cancedda R, Berra B. 2012Effects of long-term space flight on erythrocytes and oxidative stress of rodents. PLos ONE 7, e32361. (doi:10.1371/journal.pone.0032361) Crossref, PubMed, ISI, Google Scholar

    • 75

      Olsson M, Nilsson A, Oldenborg P-A. 2007Dose-dependent inhibitory effect of CD47 in macrophage uptake of IgG-opsonized murine erythrocytes. Biochem. Biophys. Res. Commun. 352, 193–197. (doi:10.1016/j.bbrc.2006.11.002) Crossref, PubMed, ISI, Google Scholar

    • 76

      Chang K-H, Stevenson MM. 2004Effect of anemia and renal cytokine production on erythropoietin production during blood-stage malaria. Kidney Int. 65, 1640–1646. (doi:10.1111/j.1523-1755.2004.00573.x) Crossref, PubMed, ISI, Google Scholar

    • 77

      Vedovato M, De Paoli Vitali E, Bigoni L, Salvatorelli G. 2002Plasmodium falciparum: erythropoietin levels in malaric subjects. Comp. Clin. Pathol. 11, 148–152. (doi:10.1007/s005800200014) Crossref, Google Scholar

    • 78

      Burgmann Het al.1996Serum levels of erythropoietin in acute Plasmodium falciparum malaria. Am. J. Trop. Med. Hyg. 54, 280–283. Crossref, PubMed, ISI, Google Scholar

    • 79

      El Hassan AMA, Saeed AM, Fandrey J, Jelkmann W. 1997Decreased erythropoietin response in Plasmodium falciparum malaria-associated anaemia. Eur. J. Haematol. 59, 299–304. (doi:10.1111/j.1600-0609.1997.tb01690.x) Crossref, PubMed, ISI, Google Scholar

    • 80

      Rice L, Ruiz W, Driscoll T, Whitley CE, Tapia R, Hachey DL, Gonzales GF, Alfrey CP. 2001Neocytolysis on descent from altitude: a newly recognized mechanism for the control of red cell mass. Ann. Intern. Med. 134, 652–656. (doi:10.7326/0003-4819-134-8-200104170-00010) Crossref, PubMed, ISI, Google Scholar

    • 81

      Totino PRR, Magalhães AD, Silva LA, Banic DM, Daniel-Ribeiro CT, de Fátima Ferreira-da Cruz M. 2010Apoptosis of non-parasitized red blood cells in malaria: a putative mechanism involved in the pathogenesis of anaemia. Mal. J. 9, 1. (doi:10.1186/1475-2875-9-350) Crossref, PubMed, ISI, Google Scholar

    • 82

      Jakeman GN, Saul A, Hogarth WL, Collins WE. 1999Anaemia of acute malaria infections in non-immune patients primarily results from destruction of uninfected erythrocytes. Parasitology 119, 127–133. (doi:10.1017/S0031182099004564) Crossref, PubMed, ISI, Google Scholar

    • 83

      Menendez C, Fleming AF, Alonso PL. 2000Malaria-related anaemia. Parasitol. Today 16, 469–476. (doi:10.1016/S0169-4758(00)01774-9) Crossref, PubMed, Google Scholar

    • 84

      Fernandez-Arias C, Arias CF, Rodriguez A. 2014Is malarial anaemia homologous to neocytolysis after altitude acclimatisation?Int. J. Parasitol. 44, 19–22. (doi:10.1016/j.ijpara.2013.06.011) Crossref, PubMed, ISI, Google Scholar

    • 85

      Salmon MG, De Souza JB, Butcher GA, Playfair JHL. 1997Premature removal of uninfected erythrocytes during malarial infection of normal and immunodeficient mice. Clin. Exp. Immunol. 108, 471. (doi:10.1046/j.1365-2249.1997.3991297.x) Crossref, PubMed, ISI, Google Scholar

    • 86

      Evans KJ, Hansen DS, van Rooijen N, Buckingham LA, Schofield L. 2006Severe malarial anemia of low parasite burden in rodent models results from accelerated clearance of uninfected erythrocytes. Blood 107, 1192–1199. (doi:10.1182/blood-2005-08-3460) Crossref, PubMed, ISI, Google Scholar

    • 87

      Fernandez-Arias Cet al.2016Anti-self phosphatidylserine antibodies recognize uninfected erythrocytes promoting malarial anemia. Cell Host Microbe 19, 194–203. (doi:10.1016/j.chom.2016.01.009) Crossref, PubMed, ISI, Google Scholar

    • 88

      Cheuvront SN, Kenefick RW, Montain SJ, Sawka MN. 2010Mechanisms of aerobic performance impairment with heat stress and dehydration. J. Appl. Physiol. 109, 1989–1995. (doi:10.1152/japplphysiol.00367.2010) Crossref, PubMed, ISI, Google Scholar

    • 89

      Liu C, Li S, Liu T, Borjigin J, Lin JD. 2007Transcriptional coactivator PGC-1α integrates the mammalian clock and energy metabolism. Nature 447, 477–481. (doi:10.1038/nature05767) Crossref, PubMed, ISI, Google Scholar


    Page 2

    This paper introduces the new concept of interactive numerals. Interactive numerals highlight a wide range of hitherto unnoticed issues with entering and using numbers reliably with interactive computer systems. The reach of the concept is very broad, covering calculators, spreadsheets, medical devices, computerized forms, web sites, airplane cockpits—in fact, all forms of numerical data entry system.

    Interactive numerals highlight common avoidable defects in many interactive systems where people use numbers. Incorrectly implemented interactive numerals cause subtle and sometimes critical problems. A proper understanding of interactive numerals should help researchers identify related problems and seek suitable solutions, as well as help practitioners recognize design problems and decide on effective responses: whether re-implementing systems, training users, or upgrading, replacing or even banning the systems.

    Understanding interactive numerals makes error management more reliable: deciding what to do after an error has occurred requires understanding the root causes of the error. In particular, until interactive numerals are properly implemented, using data logs in investigations of alleged user error is unreliable (devices tend to record what they do, not what the user tells them to do).

    This paper shows that poorly implemented interactive numerals are surprisingly widespread; even systems developed by the world’s leading programmers are not immune. In hospitals—to take just one example of a critical application of interactive numerals—clinicians routinely enter drug doses and radiation doses into interactive systems, where numerical errors can be fatal and are often treated as criminal. It is, therefore, important to properly understand interactive numerals and any limitations they have in current systems. Indeed, this paper shows that many numerical errors are caused, not by users, but by the poor design of the systems. We will present some worrying examples taken from a wide range of systems in the body of the paper.

    The implication is that interactive numerals are not well understood; this paper aims to correct that. This paper provides an analysis that should help everyone appreciate the causes of the problems and the limitations of current interactive systems. The aim is to begin to move towards more robust implementations of numbers in all digital devices, systems and services.

    We take it for granted that 2016 and 3.14 are numbers, but strictly they are numerals, ways of printing or writing number values. When we see 2016, we may think of a particular number, but what we see is not the number itself; strictly speaking, it is just ink patterns that we interpret as representing a numerical value. More specifically, the four ink patterns ‘2016’ make up an Arabic numeral, just like ‘two thousand and sixteen’ makes up an English numeral, and MMXVI is a Roman numeral. Note that the same number can be represented by many different numerals, and even by many different Arabic numerals (thus 003.14 is equal to 3.14, though 3.1400 is generally not considered exactly the same as the number represented by 3.14 because it is more precise).
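
    The distinction is easy to demonstrate in code. The following minimal Python sketch (our illustration, not from any system discussed in this paper) shows string identity and numeric equality coming apart:

        print('003.14' == '3.14')                  # False: different numerals
        print(float('003.14') == float('3.14'))    # True: the same number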

    We now think of Roman numerals as being rather awkward. For example, the Romans had no sensible way of writing 3.14, and Roman numerals seriously limited how people could use and think about numbers. Nevertheless, after Fibonacci introduced Arabic numerals to Europe around 1200 [1], it still took centuries for Arabic numerals to become more popular: people were attached to the familiar, and anyway many people used counting boards and other devices for everyday arithmetic.

    Today we are facing a new transition, one we argue is as dramatic as the move from Roman numerals to Arabic numerals. We call the new way of thinking about numbers interactive numerals.

    As well as popularizing decimal positional notation, Arabic numerals transformed arithmetic by introducing the cipher, an explicit symbol for zero. Arabic numerals, however, cannot distinguish between no number and zero, as ‘no number’ has no valid Arabic representation. It turns out that this and many other subtle problems are rife in interactive computer systems—and often cause problems that are hard for users to understand or avoid. Interactive numerals help us think clearly about such issues, and—properly implemented—directly address such issues, as we will show in this paper.

    Arabic numerals, which we are now very familiar with, are usually written on paper or displayed on screens, and we then read them as numbers. Interactive numerals look much the same but, unlike Arabic numerals, which are only read and used after being written, interactive numerals are read and processed by a computer that has to represent them (at least on the screen, if nowhere else) as they are being written. Interactive numerals thus cover the process of writing down a numeral, corrections and all, not just its final form as a full-fledged numeral. As a special case, interactive numerals handle the case of no numeral yet written. By contrast, ‘nothing yet written’ cannot be represented as an Arabic numeral at all, since it has no digits.

    Imagine typing the number −0.5 into a computer; unlike the traditional piece of paper, the computer has to make sense of each and every intermediate step as we type −, then −0, then −0. and so on. This raises many subtle problems. For example, − by itself is not a number at all, and −0 is strictly equal to 0. Yet the computer has to decide how to represent the interactive numeral at every step, even before the user has finished entering it. In many cases, as we will show, the computer makes premature decisions about the representation that also change the meaning of what is being entered.

    Interactive numerals occur everywhere users enter numeric data into computers. Put very briefly, the problem is that not all interactive numerals are Arabic numerals, but assuming they are—which is a common conceptual error—leads to problems and use errors that are very hard to recognize, let alone correct. Indeed, in our early work [2], where we noticed there was a problem but had not fully grasped its extent, we did not make a sharp distinction between numerals and numbers, and we fell victim to some confusions ourselves. The new contribution of this paper, then, is not just bemoaning the problem, but providing a formal framework wherein problems can be identified and solutions can be worked out.

    Unfortunately, as we will show, problems with interactive numerals are widespread. When mathematicians, programmers and scientists notice and can reason about these problems and hence seek solutions, every user (and their work) will benefit—even if they do not notice any changes.

    Numerals are sequences of symbols that follow agreed conventions for interpreting them as number values. In this paper, we consider digit-based systems of numerals for representing numbers rather than, say, counters or linguistic word-based numerals like English phrases.

    Given a set of symbols S, generally called digits, a numeral is a non-empty string of symbols (a member of S⁺) together with a surjective function N that maps each string to a numeric value. Typically, N can be expressed explicitly in a simple arithmetical way. Note that special symbols, such as π ∈ S, may or may not be considered numerals depending on the context.

    For example, with S = {0, 1} we may define N⟦dₙ₋₁ … d₁d₀⟧ = dₙ₋₁2ⁿ⁻¹ + ⋯ + d₁2 + d₀, the usual positional interpretation of a binary numeral.

    There are trivial generalizations so that S can include signs (+, −), radix points¹ and digit block separators (e.g. the space or comma), and of course generalizing to bases other than 2 is trivial. Further notation may be used to represent precision, choice of base and so on [3].

    Since N defines an equivalence class, if for two numerals x, y we have N⟦x⟧ = N⟦y⟧, we tend to say that x equals y, or even that x is y. It is therefore common, very easy, and frequently harmless to confuse numerals and numbers.

    Now, depending on how N is implemented (programmed) from its specification above, it can interpret a numeral like 11.1 in different ways:

    N⟦11.1⟧ may be:

    • 1 (by stopping at the radix point, reading from the right)
    • 3 (by stopping at the radix point, reading from the left)
    • 7 (by ignoring the radix point)
    • 9 (by using ‘subtract ASCII code of 0’ to get values of symbols)
    • guarded, so the case cannot occur
    • undefined
    • reported as an exception
    • excluded by type checking at compile time, since S cannot contain ‘.’

    but in no case will it have the ‘obvious’ value of 3.5₁₀ (11.1 in binary is 3.5 in decimal), at least if sensibly refined from our explicit specification above. This is obviously something of a toy example, but it highlights that any computer implementation of evaluation has to make explicit things that are not necessarily explicit in the specification, so that the computer can actually do them.
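
    To make the list above concrete, here is a minimal Python sketch (our illustration; the function names are ours) of careless digit-parsing loops that produce three of the divergent readings of the binary numeral 11.1:

        def stop_at_radix_from_left(s):
            # read binary digits until the first non-digit: '11' gives 3
            value = 0
            for ch in s:
                if not ch.isdigit():
                    break
                value = value * 2 + int(ch)
            return value

        def ignore_radix(s):
            # skip the radix point entirely: '111' gives 7
            value = 0
            for ch in s:
                if ch.isdigit():
                    value = value * 2 + int(ch)
            return value

        def ascii_subtract(s):
            # naively subtract the ASCII code of '0': '.' becomes -2, so '11.1' gives 9
            value = 0
            for ch in s:
                value = value * 2 + (ord(ch) - ord('0'))
            return value

        print(stop_at_radix_from_left('11.1'))   # 3
        print(ignore_radix('11.1'))              # 7
        print(ascii_subtract('11.1'))            # 9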

    One fundamental and unavoidable implementation problem is that computers, being finite, cannot implement the natural numbers ℕ (let alone the real numbers ℝ, etc.) correctly. At the very least, any implementation of numerals must decide how to limit the infinite and, if done well, how to minimize and manage the impact of the ensuing errors. (Using arbitrary precision numbers would obviously help.) Implementations that neglect the impact of such fundamental problems are, as we will see, likely to cause severe problems.

    The definition of numeral, above, takes a numeral to be a fixed entity that denotes a numerical value. By contrast, an interactive numeral includes the process intended to create a numeral. In other words, an interactive numeral additionally specifies the procedural implementation of a numeral, in contrast with the conventional, purely declarative specification of an ordinary numeral, which abstracts away from the process of construction.

    An interactive numeral may therefore be defined by a vector of string buffers bᵢ ∈ B, an initial contents b₀ = d (sometimes called the ‘default’, possibly the empty string), a set of actions A: B × ℕ → B to modify buffers, i.e. the action a on buffer bᵢ gives bᵢ₊₁ = a(bᵢ, i), together with a surjective function N_int that maps buffer strings to realizable numeric values V ∪ E, including exception and other conditions E. Buffers generally have associated invariants, for instance on their length, e.g. ∀i: 0 ≤ |bᵢ| ≤ 8.

    An implementation will usually optimize buffers into a single object, such as a string, and update it on each action rather than creating a new buffer each time. In many incorrect implementations of interactive numerals, buffers are optimized directly to a numeric value, which of course cannot correctly represent exceptions (such as entering too many digits or numerals with as-yet incorrect syntax).

    For example, pressing the digit 1 is an action that will create a buffer bᵢ₊₁ = bᵢ1 by appending 1 to the previous buffer bᵢ. However, if bᵢ is too long, then 1 cannot be appended, and some exception condition will be flagged (or be ignored, or be handled inappropriately), and typically we will have bᵢ₊₁ = bᵢ. Interactive numerals may additionally have timed actions, so for instance if there is a timeout (e.g. when no other action occurs for 1 minute), then typically bᵢ₊₁ = d.
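
    A minimal sketch of this definition may help; the following Python illustration is ours (the class name and the limit of 8 are assumptions for the sketch). The buffer is kept as a raw string, the length invariant is enforced, and evaluation to a number is deferred until the buffer is submitted:

        class InteractiveNumeral:
            MAX_LEN = 8                     # invariant: 0 <= len(buffer) <= 8

            def __init__(self, default=''):
                self.buffer = default       # b0 = d, here the empty string

            def press(self, key):
                # action: append a key, preserving the length invariant
                if len(self.buffer) >= self.MAX_LEN:
                    raise OverflowError('buffer full')   # flag, do not silently ignore
                self.buffer += key

            def delete(self):
                # action: remove the last key, whatever it was
                self.buffer = self.buffer[:-1]

            def submit(self):
                # N_int: only now map the buffer to a value, or an exception
                try:
                    return float(self.buffer)
                except ValueError:
                    raise ValueError('not a valid numeral: ' + repr(self.buffer))

        n = InteractiveNumeral()
        for key in '0..5':
            n.press(key)    # '0..5' is storable and correctable, though it is not a number
        # n.submit() would raise ValueError rather than silently guessing 0.5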

    A realizable value is an implementable representation of a numeric value; for example, for a cash machine (ATM) it may be 10n, n ∈ ℕ, 1 ≤ n ≤ 25; or it may be an IEEE 754-2008 floating point number, with or without rounding and exceptions. By contrast, for a conventional numeral, the range of valid numerals and values is rarely if ever explicitly specified. While IEEE floating point has peculiar properties that may ‘leak’ into the user interface (we give examples below), arbitrary precision reals do not avoid the problems, as the buffer invariants must still hold, and in any case buffers are not required to be real numerals (e.g. because of syntax errors).

    Often there are conventional actions, such as an action submit ∈ A to ‘submit’ a buffer (and perhaps leave it unchanged), an action clear with clear(bᵢ, i) = d, and so on. When actions can include mouse movement, selection, and cut and paste, the buffer is generally refined to include pointers or stacks. In any case, note that A must include every possible action, not just a small set of conventional digits—the set A has to cover the possibility that the user can do anything. If the user presses a non-numeral key like the space bar, what should happen? In particular, we cannot rely on type checking (as might forbid the example of N⟦11.1⟧ above) because every input is valid—type checking does not control what a user does.

    In short, an interactive numeral includes how the numeral is dynamically created, giving it a continual interpretation, including for all its intermediate representations and exception conditions. In contrast with numerals, then, N_int for interactive numerals is complex—its definition, even without the algorithm, is far more complex than that of a conventional numeral. We therefore do not show a specification of N_int here.

    Since numerals ‘are’ numbers in common thinking (see the note in §3.1 above), it is very easy to turn a blind eye to the daunting complexity of N_int, considering it instead to be effectively N, and therefore to gloss over all the exception cases. This is a tempting category error, and one with consequences this paper explores.

    Strings of symbols that look like numerals are frequently used in contexts that might be considered more strictly to be names or identifiers. We call these quasi-numerals.

    Bank account numbers and book numbers (ISBNs) are examples of quasi-numerals. Despite commonly being called numbers, arithmetical operations (such as doubling or adding one) make little sense with such numbers. In other respects, however, they behave very like interactive numerals. They are entered as digits and have a structure, and the value denoted must fall in a particular range to be valid. In fact, the range of correct values may be highly constrained by hidden arithmetic operations that specify a check digit as part of the numeral. (Incorrect check digits should cause exceptions.)
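
    As an illustration of such hidden arithmetic, here is the standard Luhn check used by payment card numbers, sketched in Python (this is the published algorithm, not anything specific to the systems discussed in this paper):

        def luhn_valid(digits):
            # double every second digit from the right, subtracting 9 from
            # any doubled digit above 9; the total must be divisible by 10
            total = 0
            for i, ch in enumerate(reversed(digits)):
                d = int(ch)
                if i % 2 == 1:
                    d = d * 2
                    if d > 9:
                        d -= 9
                total += d
            return total % 10 == 0

        print(luhn_valid('79927398713'))   # True: a standard test number
        print(luhn_valid('79927398710'))   # False: one corrupted digit is caught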

    But bank account ‘numbers’ do not denote members of a set of conventional numeric values like ℕ, despite being countable. Similarly, many strings of symbols that are not just digits, such as passwords or people’s names and social security numbers, are used in a similar sense of denoting members of sets. The decision of what is a digit and what is not within any character code is an arbitrary convention (e.g. is ‘0’ a digit, or is ‘O’ a capital letter?), so inevitably the boundary between numerals and general strings is blurred.

    The user interfaces that permit quasi-numerals (bank account numbers, car registration numbers, personal ID numbers and so on) to be created are very similar to the user interfaces that permit interactive numerals to be created. Sometimes, and confusingly, the user interfaces are identical. Therefore, there is considerable and interesting overlap in design issues across the various types of numeral and quasi-numeral. In this paper, however, we are particularly concerned where, somehow, the user interface must preserve relevant arithmetical properties of the numbers the interactive numerals are taken to represent. There is a large area of concern where the natural conception—the user’s natural conception—of N should be very simple but in fact confusingly fails to be so.

    Interactive numerals arise in many areas and applications, from web pages and spreadsheets to handheld calculators, as well as in special purpose instrumentation, burglar alarms, medical devices and more. However, it is important not to confuse interactive numerals with the applications that use them. In many areas the applications blur the distinctions, which plays into our natural tendency to equate numerals and numbers.

    Spreadsheets make the problems particularly acute. A cell in a spreadsheet can certainly allow the user to enter an interactive numeral. However, the number value it finally denotes depends on the formats and formulae in its and other cells. For example, if the numeral ‘5.5 kg’ is entered in Microsoft Excel, it has value 0 if subject to the operation SUM, but value 1 if subject to PRODUCT.² Furthermore, if the numeral entered is ‘1.’ it will typically be displayed as exactly 1, without a decimal point. These interactions are complex to explain in adequate detail, and can be confusing traps for both spreadsheet implementors and users.

    Another area of confusion is calculators, which come as basic calculators (left-to-right operator precedence), so-called ‘algebraic’ calculators that respect conventional operator precedence, and reverse Polish notation (RPN) calculators. In all forms of calculator the issues of interactive numerals are very similar; the main differences between types of calculator lie in arithmetic expression parsing, which is independent of how interactive numerals are treated. In particular, while RPN calculators are popular among certain communities, they have serious problems that interfere with their otherwise consistent arithmetic parsing: the RPN stack is generally limited to a small size, and stack overflow has undefined effects [4]. Such complex and invisible problems, in our opinion, eclipse their alleged benefits.

    There are many generalized types of numeral, ‘things that denote number values’—such as dice, dials, increase/decrease chevron buttons, speech, clocks, counters, etc. [5], as well as names (such as month names denoting month numbers)—but the focus of this paper is on the dominant case of left-to-right sequential typing, canonically typing and entering conventional Arabic numerals to specify numeric values for interactive devices, computer programs or other applications.

    We note that sensors may be considered a broad generalization of numeral for when physical or other processes rather than humans define numeric values. Clearly, to measure a temperature, either a thermal sensor can be directly connected to a computer, or a human can use a conventional thermometer and enter a numeral denoting the temperature that has been ‘read off’ numerals shown on the thermometer. Sensors, however, often employ numeric transforms such as low pass filters to improve measurement dependability.

    One of the best inventions of computers is the delete key, which allows us to correct mistakes. Suppose you want to enter 0.5 but you mistakenly enter 0..5, with two decimal points. You will want to correct this error. But try this on almost any calculator, and you will find that the calculator has already ‘corrected’ your error: it has decided you entered 0.5 before you have even started correcting it. Ironically, that means if you do correct the error, you will make things worse, perhaps entering 5 instead. The computer decided what you were entering before you had finished entering it; it ignored the second decimal point, which then made correcting it counter-productive.

    The computer (or, rather, the computer’s programmer) wants to ensure syntactically correct numerals, so that they can always be represented as simple number values, while also allowing you to correct them with delete. Specifically, the computer is programmed to display a valid number because that is how the value is represented internally (e.g. as a floating point number). But numbers cannot represent all interactive numerals: for example, nothing with two decimal points has any number value to represent it, yet it can be corrected to become a valid numeral. It follows that trying to do both leads to unreliable and inconsistent behaviour. This is a problem of ‘premature semantics’: computers generally try to treat the user’s errors as valid numbers too soon, when no sensible numerical meaning can or should be assigned to them at that point. Premature semantics is explained more fully below, in §5.1.

    It may be argued that a user might have entered 0..5 intentionally, for instance because they know how the user interface works, and pressing the decimal point twice increases the chance it has been correctly pressed at least once. If so, what is the problem? The problem is that the programmer has prematurely committed the user’s probable error to be a valid number, and the error and its consequences are now unknown to the user. The programmer does not know what a user intends, and an error is usually indicative of a failure to carry out an action as intended. For example, the user may have intended to enter 0.05, but because the 0 and . keys are close, the . key was pressed accidentally. The programmer has no idea. Throwing away information on an assumption is dangerous, and avoidable.

    If you set out to enter −3 on the Apple MacOS v. 10.11.6 calculator, the key sequence ± 3 leads to the result of 3 (cf. figure 1).³ The change sign key is probably ignored because −0 did not seem meaningful to the designer: −0 is equal to 0, which is what is displayed, so the change sign key had no effect!


    Figure 1. Three contemporaneous calculators, all the latest versions as of August 2016. As explained in the paper: (a) the error in the iOS calculator occurs when the user tries to change sign under some circumstances when it displays −0; it ought to display 0, not Error. (b) NaN occurs when the user tries to change sign under some circumstances when the MacOS calculator displays 0; it ought to display −0. (c) A 40 digit number entered by the user and displayed in the MacOS app calculator is impossible to read, being about 1 mm high on the original screen. The calculator displays only some of the least significant digits of the number, so the display is misleading even when it can be read. (a) Apple iOS, (b) Apple MacOS pane and (c) Apple MacOS app.



    Figure 2. Google’s Android calculator calculating 5/6 = 0.8333…. In the right-hand image, scrolling the answer left/right produces a bizarre multiplier (E−17, presumably meaning ×10⁻¹⁷) and a hidden decimal position (off the left of the screen and out of sight), thus producing a meaningless result. (Compare with the Apple calculator’s use of reduced font size shown in figure 1c; neither approach works correctly for the user.) Screenshots provided by Martin Atkins.


    And if you correct some mistakes, certain key sequences give NaN (on MacOS), which means Not a Number. NaN is an internal value representation (e.g. defined in the IEEE floating point standards) that means nothing to non-technical users—clearly, displaying it to general users is a fault in the calculator itself: it has ignored an error, and its number representation code has failed to display it as either a number or as an error message.

    Typing a similar correction sequence involving delete crashes Apple’s iOS v. 9.3.3 calculator (the effect of delete on iOS is achieved by swiping the finger left/right across the numeric display). But the same sequence displays 0 on both Apple’s MacOS calculators (the stand-alone app and the notification pane calculator).

    Keying one and the same sequence gives −8 (MacOS pane calculator) or −89 (MacOS app calculator). The change sign key is handled differently too: one sequence gives −89 on iOS but 89 on both MacOS calculators, while another sequence gives either −89 or 9, depending on the calculator.

    So three modern calculators from a leading manufacturer, all clearly designed to look the same and share the same ‘look and feel’, and therefore expected to behave identically, implement the same actions differently—and all behave incorrectly by any standard interpretation. Arguably they should at least be consistent, even if we disagree about what correct behaviour should be.

    Evidently, handling interactive numerals properly is problematic. It is surprising that calculators, which have had their current form for over 40 years, are still implemented inconsistently and incorrectly. Defects of this sort are rife in all data entry devices, not just calculators.

    Cut and paste, like delete, is a basic feature of modern computers: in particular, typing text and pasting text typed elsewhere should be interchangeable. The Apple calculators, however, respond to pasting differently. If you type 2+3 then =, all the calculators will display 5 correctly. If instead you paste 2+3=, the calculators respond differently: iOS and the MacOS pane give 2, but the MacOS app calculator gives 23 (though it gives 2 if 2−3 is pasted, so + and − are treated differently). None report an error or give any indication that keystrokes are being adjusted or discarded. But cut and paste can be done correctly: in the Windows 7 calculator v. 6.1, for instance, pasting 2+3= produces 5 in the display.

    To give another example from Apple’s calculators: typing 9999999999999…, a number almost as large as you like, will display as exactly 999,999,999 (that is, nine digits) on the iOS and pane calculators, and then adding 1 (keying + 1 =) will display 1e9 (i.e. meaning 10⁹), which is wrong, as the correct answer is much larger. On the Apple MacOS app calculator, however, larger numbers get displayed in progressively smaller fonts as they are entered, finally becoming so small they are completely unreadable. Then, adding 1 as before, the large number is presented in clear, full-size 1E43 (or whatever). In other words, the calculators can display large numbers, but not when the user enters them directly!

    Note that E and e are different notations, and in fact neither is the standard mathematical notation. In standard notation, 1E43 or 1e43 should be written 10⁴³; more generally, a number like 2.3E43 (or 2.3e43) would be written 2.3×10⁴³, which all of the Apple calculators, with their high-resolution displays, are technically able to display.

    Curiously, when the iOS calculator is turned sideways (to make it landscape) it becomes a scientific calculator and the user can then enter large numbers directly—except using a key labelled EE, not E: one speculates that having two very similarly named keys, namely e (the base of natural logarithms) and E (raising to a power of ten), would have been thought too confusing. We would argue, then, that if the key names would be confusing, this is another reason not to use the non-standard notation E (or e) at all in numerals!⁴

    Although we have used Apple for the concrete examples above, other leading manufacturers, such as Casio and Microsoft, are not immune to these problems. It seems that before the concept of interactive numerals (or something equivalent) was clarified, as we do in the present paper, nobody thought about these details, probably because numeric data entry seems trivial and does not seem to require much thought to get working. But interactive numerals are not obvious, and there has been a failure to learn and employ rigorous computational thinking throughout design—apparently nobody considered it necessary! The result is that there is no standard approach. Swapping one calculator for another will give different results for exactly the same calculation, and (as we have shown elsewhere) there are also many other bugs in calculators [4]. It is dangerous.

    We note that people use calculators primarily because they do not know or are not certain of the answers, so bugs in calculators are very hard for them to spot and work around.

    Though these details may sound like nit-picking, they have real effects. For example, two students narrowly avoided death after an incorrect calculation performed on a mobile phone [6].

    Doing a calculation more than once, and in a different way, remains good advice: it helps protect not only against user error but also against design error.

    We have elsewhere given examples of medical devices with interactive numeral problems [7–11] too. Another example, shown in figure 3, is taken from a ‘clinically validated risk calculator’. Evidently, clinical validation is not sufficient to check for the sorts of interactive numeral errors that are shown in the figure—which of course could undermine any clinical value of using the calculator.


    Figure 3. Screen shots from QRISK2, a clinically-validated risk calculator, available at www.qrisk.org. (a) Invalid numerals entered as data; (b) apparently valid result. Note that no errors are reported by QRISK2, so the user may be unaware of any problems, and then will act inappropriately. These screen shots were taken in October 2016.


    Screen shots from an award-winning app are shown in figure 4. The figure shows the problem of interactive numerals growing too large for their display regions: what is apparently displayed as a patient weight of 60 kg is in fact a much larger number, too large to fit in the small display. The app should not allow numbers to overflow their input fields, because they are ambiguous when they do. The result will be confusion, hopefully noticed by the user before any potential patient harm.


    Figure 4. Screen shots from Mersey Burns, a multi-award winning app. The app is available from merseyburns.com. (a) Weight of patient in kilograms: apparently 60 kg is displayed; (b) calculated dose, from screen shot: the huge dose is clearly incorrect. In this example, no errors are reported by Mersey Burns, so the user may act inappropriately on the advice displayed. Here, the calculated dose looks very obviously wrong (the first 7.54 h dose is over 10⁹ l h⁻¹, compared with an average adult blood capacity of 5 l): evidently the app does no or inadequate validation and sanity checking. More generally, less obvious errors may be missed by clinicians using this app, with potentially harmful effects. Note that Mersey Burns is used for treating burns victims—who may be in considerable pain, and whose treatment will be urgent—so what may seem obvious to us calmly reading this paper may easily be misinterpreted under real clinical conditions. Screen shots taken in November 2016.


    It is interesting that despite the obvious programming problems, as illustrated in the figure, the app not only won prizes but is well known for being the first British ‘CE marked’ medical app. CE marking is a European legal mark, here meaning that an app is approved for clinical use in Europe, so evidently the CE approval process failed in this case to adequately check interactive numeral processing. We know many other problematic medical devices that have CE marks [7–12], so this implies the CE marking process is flawed as it overlooks serious design defects that may adversely affect patients.

    In 2007, Greta Fossbakk lost 500 000 Norwegian Krone (equivalent to about US$100 000) due to an interactive quasi-numeral bug, a problem that still persists in many bank systems [10,13]. Fossbakk keyed in an extra digit in an account code that, despite making it an invalid account number, was not caught by the online banking system (figure 5). Her bank argued that she could not prove that she keyed 12 digits—as with other user interfaces this paper discusses, it seems to be no coincidence that the user interface does not log what the user does, which is to the bank’s advantage. Olsen’s [13] experiments with users entering bank account details suggest that 0.2% of all interactive bank transactions will suffer from a similar (unnecessarily) undetected error: given the scale of internet banking, this is a huge unnoticed problem.


    Figure 5. Fossbakk used her bank’s user interface to transfer money to her daughter’s account. Unfortunately she double-pressed a key, resulting in a longer account number. The user interface truncated the number, thus obtaining another, but valid, account number. Thinking she had keyed the correct number, Fossbakk confirmed the transfer, despite it being to a differently-named account, which apparently was not displayed, checked or confirmed. Details taken from Olsen [13].


    Apart from not silently truncating numbers, the problem would also be reduced if numerals were displayed with separators (spaces, dashes, etc.), breaking the digits up into groups and thus making them easier to read and, indeed, easier to type correctly. Because the computer system failed her, Fossbakk alone had to notice the difference between 71581555022 and 71581555502; this would have been much easier if the numerals had been displayed as 715-815-550-22 and 715-815-555-02, which are more obviously different. Indeed, it is concerning that credit card numbers are usually printed split into chunks, but user interfaces typically delete the gaps, thus deliberately making the numerals much harder to read, type and check.
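
    Grouping is trivial to implement on the display side; a short Python sketch (the function name is ours) makes the two account numbers visibly different without changing the underlying digits:

        def group_digits(numeral, size=3, sep='-'):
            # break a long numeral into fixed-size chunks for display only
            return sep.join(numeral[i:i + size]
                            for i in range(0, len(numeral), size))

        print(group_digits('71581555022'))   # 715-815-550-22
        print(group_digits('71581555502'))   # 715-815-555-02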

    Problems with interactive numerals are widespread, and users like Fossbakk are often penalized for design failings. We suspect that manufacturers are in denial for a combination of reasons: they do not explore the issues because of an unfortunate interpretation of product liability (they do not want to discover any problems that may be their fault), and because they assume their ‘competent’ programmers would never make mistakes over something so ‘trivial’. And of course, logging exactly what users do and exactly how systems respond would expose manufacturers to the truth, which might work against them in court.

    IP numbers (both IPv4 and IPv6) are made up from blocks of digits. IP version 6 numbers are blocks of four hexadecimal digits separated by colons. Each block can have leading zeros suppressed, so 0012 and 12, for instance, would be treated as identical.
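
    Python’s standard ipaddress module illustrates the point (a small aside of ours, not from the original case): two differently written strings denote the same IPv6 address, so comparing raw strings implements the wrong semantics:

        import ipaddress

        a = ipaddress.ip_address('2001:0db8:0000:0000:0000:0000:0000:0001')
        b = ipaddress.ip_address('2001:db8::1')
        print(a == b)          # True: the same address despite different numerals
        print(a.compressed)    # '2001:db8::1', the canonical short form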

    In 2011, Nigel Lang was arrested for allegedly downloading illegal material onto a PC owned by his wife. The police had added an extra digit onto somebody else’s IP number, obtained his wife’s physical address from it, then assumed he had committed the offence (males are more likely to commit internet offences). The arrest and subsequent stigma ruined Lang’s family life. Fortunately, his efforts to defend his innocence eventually resulted in a £60 000 settlement from the police for the errors that had made him and his family suffer considerably [14].

    It is alarming that a single digit error in a user interface can have such unchecked consequences—as well as taking over 6 years to resolve. If nothing else, the design of IP numbers (strictly, IP numerals) and IP user interfaces used by the police (and others) shows little maturity about human factors, such as using check digits (which are common on ISBNs, bank account numbers, etc.).

    Forms often include interactive numerals. In addition to problems with individual interactive numerals (which are generally as discussed above), forms often impose constraints on the relations between some or all numerical values they allow, as well as how they can be entered.

    For example, dates may be entered on a form as a triple of numerals: the day number, the month number and the year number. If the user edits the date 12/30/17 (representing 30 December 2017), say, to 2/28/17 (perhaps starting by deleting the first 1 in 12/30/17), there are possible intermediate values that are not valid dates (such as 30 February)—yet the user interface must permit editing to allow the date to be corrected. Many (bad) forms force the user to make a circuitous edit so the date is always valid; if so, this is a case of premature semantics, which we discuss more generally in §5.1 below.
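
    One way out, sketched below in Python under our own assumptions (the field names are illustrative), is to keep the three fields as raw strings while the user edits, and to validate only when the whole form is committed:

        import datetime

        def commit_date(month, day, year):
            # validation happens only here, at commit time
            return datetime.date(2000 + int(year), int(month), int(day))

        fields = {'month': '12', 'day': '30', 'year': '17'}
        fields['month'] = '2'            # transient state 2/30/17: invalid, but storable
        fields['day'] = '28'             # the user completes the edit
        print(commit_date(**fields))     # 2017-02-28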

    We note that, as with many interactive numerals, it is rarely possible for a user to save a form before it is valid. For a single interactive numeral, this may be a useful precaution, as ‘saving’ and ‘acting on’ a single value are closely related and perhaps too easily confused; but with a form, the user may be prevented from saving many values (such as names and addresses). In this case the imposition of premature semantics seems to us a serious step backwards from paper forms, which of course save the user’s work continually, regardless of its validity.

    Although the examples come from diverse contexts, the problems fall into two types: problems of construction and problems of representation. Interactive numerals are constructed by the user, but as they are being constructed they are also being interpreted by the system, at the very least so they can be represented in the display. People make mistakes, and so in the process of constructing interactive numerals they can be expected both to err and to try to correct their errors. Programmers make mistakes too—typically in unconsciously assuming that the input of numbers is trivial.

    In the case of Fossbakk, the mistake was not detected by the computer even though it could have been. In the calculator examples, unlike bank accounts, it is not possible to know what numbers or sorts of numbers users intend, but it should be expected that users will attempt to correct mistakes they notice. However, as we showed, standard features like the delete key, which are supposed to help, can ironically lead to confusion.

    These are serious problems. In our experiments [15,16], we have found that users regularly enter numbers incorrectly and nearly 4% of the time they do not notice. (If you think this rate is high, then that is because you are not noticing your own errors, and you mistakenly think the rate should therefore be lower.) Some novel styles of entering numbers can reduce the error rate to experimentally undetectable levels [17].

    We note that standard texts on human factors and error (e.g. Reason’s classic error taxonomy [18]) rarely discuss noticing error, and hence rarely note the important role that computers can have in helping users notice (and hence correct or manage) numerical and other errors—in particular, the many sorts of errors humans are poor at noticing, such as errors in check digits. Insidious cases like those discussed by Moore [19] concerning radiotherapy computer systems show that numerical and calculation errors caused by mixes of poor programming and innocent use error can persist unnoticed for decades, and hence affect thousands of people.

    Other common methods of construction, such as cut and paste, fail to alert the user to interpretations of what was constructed that deviate substantially from what was pasted.

    The problems of representation are unavoidable for interactive numerals because the numbers must be represented and are unbounded, yet displays are finite. Either a display truncates the interactive numerals, leading to incorrect values, or it attempts to display them in full, leading to unreadable displays, as seen in both the calculators and the medical apps. The representations also fail to match existing mathematical notations, and moreover do not allow users to use such notations in a consistent way.

    The conclusion from these various examples must be that this is not simply manufacturer sloppiness so much as that interactive numerals are deceptively different from standard Arabic numerals; they are a newly identified and surprisingly complex phenomenon.

    Indeed, realizing that interacting with numbers is complex and a serious design challenge [2] (issues we now recognize as falling within the scope of interactive numerals), we have sought and found various ways of reducing the error rate [20,21]—ways that would be invaluable in critical applications such as finance, avionics and medicine. The ability to reduce error rates with improved approaches to interactive numerals proves that current accepted ways of entering numbers are unnecessarily unreliable.

    The science of interactive numerals (and other classes of user interface design issues) has not been extensively studied, particularly in the HCI (human–computer interaction) field, which one might feel would be its natural home. By contrast, say, to the widespread activity in security [22], this lack of attention seems remarkable (CHI+MED is an exception [23]). Why? Although the final outcomes of security problems and safety problems may be broadly indistinguishable, there are interesting cultural and economic differences. For security, there are many outsiders who are known to be bad, and therefore defending against them is expected; investment in security (and hence investment in security research) makes sense. By contrast, for safety, there are insiders who are expected and often required to be professional; blaming the user is then an easy option as the user has apparently failed to do a good job—they are the proverbial bad apple. For safety, then, it often seems the user has failed rather than the system. Hence little investment or research is demanded for user interface safety.

    Since poor interactive numeral design induces use error, we hope accident and other investigators will consider it as a possible cause of incidents, and thus we hope this paper will stimulate new avenues of critical investigation and research that will reduce error and its consequences.

    It has long been recognized that computers do not do normal arithmetic. Normal addition satisfies the associative rule, a + (b + c) = (a + b) + c, but on a computer this rule fails because of rounding errors. For example, 0.1 as a decimal number is a recurring binary fraction, so it cannot be represented precisely as a simple binary value.⁵

    The field of numerical analysis concerns itself with such error analysis, so-called ill-conditioning and related issues [3]. Good programmers are therefore very careful to do arithmetic in ways that take numerical problems into account. Some calculators do arithmetic in base 10 (e.g. using binary coded decimal, BCD), so that users are not surprised by decimal–binary conversions causing unexpected rounding.
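
    Both points are easy to check; in the following Python examples (ours), the decimal module plays the role of the base-10 arithmetic used in BCD calculators:

        from decimal import Decimal

        # binary floating point: associativity fails because 0.1, 0.2 and 0.3
        # have no exact binary representation
        print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False
        print((0.1 + 0.2) + 0.3)                        # 0.6000000000000001

        # base-10 arithmetic avoids this particular surprise
        x = (Decimal('0.1') + Decimal('0.2')) + Decimal('0.3')
        y = Decimal('0.1') + (Decimal('0.2') + Decimal('0.3'))
        print(x == y)                                   # True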

    The corresponding problems of interactive numerals have only just been recognized (they are systematically reported here for the first time), but they are no less serious, as our examples above show. Unfortunately there has been nothing like ‘a numerical analysis’ for interactive numerals, so we aim to address that.

    The problems of interactive numerals fall into the following categories:

    Premature semantics is an idea we identify in this paper for the first time.

    For a computer program to interpret the user’s data entry, it must impose some semantics. For example, it may interpret numerals as floating point numbers; this semantics is very easy to implement and means that as a user enters a number it must be a floating point number at all times. Using the definition of interactive numerals from §3.2, the program will use Nint to evaluate the numeral bi at each step i as the user enters it; furthermore, any user action that makes Nint fail (i.e. fail to be a floating point value) will be ignored. As an interactive numeral is being constructed under this form of premature semantics, then at every stage it is interpreted as a meaningful number. But this means the program has prematurely implemented the user’s input as a number before they have finished interacting and confirmed it. In particular, it means that an erroneous sequence of interactions like

[a key sequence containing two decimal points, such as 1 . 2 . 3]
(and many others) cannot be processed correctly because the program has prematurely implemented the user’s input as a number, and there is no floating point value that represents anything with two decimal points!

    In other words, although the program eventually wants a number, in this case it has prematurely imposed the semantics of number instead of, for instance, string semantics (which can represent arbitrary keyed input from a user).
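To make the contrast concrete, here is a minimal sketch (our illustration, not the paper’s implementation) of number entry under string semantics, where number semantics is imposed only when the user confirms:

```python
def key_pressed(buffer: str, key: str) -> str:
    """Accumulate raw keystrokes as a string: every intermediate state is
    representable, including transiently ill-formed ones such as '1.2.'."""
    if key == "DEL":
        return buffer[:-1]          # delete simply removes the last character
    return buffer + key

def confirm(buffer: str) -> float:
    """Impose number semantics only at confirmation time."""
    return float(buffer)            # an ill-formed numeral raises ValueError:
                                    # report it to the user, don't guess

entry = ""
for k in ["1", ".", "2", ".", "DEL", "5"]:
    entry = key_pressed(entry, k)   # '1.2.' exists happily as a string
print(confirm(entry))               # 1.25
```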

The example NaN given above is a symptom of premature semantics. Here, the computer has been programmed prematurely to use a number to represent the user’s input, but the user has attempted some interactive numeral action on a number that the programmer has not correctly anticipated. This results in the program performing a non-numeric operation on what was prematurely (and incorrectly) programmed as a number value, and the computer responds by turning the number into an invalid number, namely NaN. The poor programming then fails to recognize this, and NaN (a technical term that is meaningless to most people) is displayed directly to the user. In other words, displaying NaN proves the program has prematurely implemented something as a simple number (probably an IEEE float).

Shneiderman describes a common data quality problem [24]: a hospital was analysing the age distributions of patients and finding statistically significant differences, etc.; but Shneiderman found patients who were 999 years old. Of course, clinicians do not always know a patient’s age, but the program they used prematurely required a number (and nothing else). A patient’s age is certainly a number, but interactively there is a point before the number is correct or even known, as in this case. Premature semantics, however, requires a number regardless. The clinicians thus had to find a workaround, such as entering 999, which is a number for the computer but not a number for the clinicians—it is an exception flag denoting ‘unknown’. Unfortunately, 999 was an exception flag known to the users but not implemented by the programmers.

    It is understandable that any such system needs some way of handling missing data but forcing a semantically correct number value here led to (perhaps well-intentioned) violations, a specific type of error [18] that can have serious consequences later. Indeed, this 999 violation worked every time, so it became standard practice. Yet it ruined data analysis, and probably disrupted hospital auditing and hence the day-to-day efficient operation of the hospital before Shneiderman uncovered the practice.
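The remedy is to represent ‘unknown’ explicitly rather than force a premature number. A sketch (ours; the function and bounds are illustrative assumptions):

```python
from typing import Optional

def record_age(raw: str) -> Optional[int]:
    """Return None for a not-yet-known age instead of a sentinel like 999."""
    raw = raw.strip()
    if raw == "":                    # explicitly unknown: no number imposed
        return None
    age = int(raw)
    if not 0 <= age <= 130:          # implausible ages are detected, not stored
        raise ValueError(f"implausible age: {age}")
    return age

ages = [record_age(s) for s in ["47", "", "3"]]
known = [a for a in ages if a is not None]
print(sum(known) / len(known))       # 25.0 -- statistics over known ages only
```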

Now that we recognize premature semantics as such, it explains a wide range of familiar problems. Web forms, for example, often require a user to fill in details they do not yet know: this is because the form implements the fields as simple values (numbers, dates, etc.), which premature semantics prevents from being left blank or partly filled in—a routine procedure that is very convenient and indeed trivial on paper forms.

    A special case of premature semantics might be called premature representation—the representation is a consequence of an internal, premature, semantic choice. Many systems display a numeral even before anything has been entered, because the display is assumed to always display a number:

    • — Many interactive systems display 0 if the user has entered nothing: they then cannot distinguish a user who has entered nothing from one who has explicitly entered zero.

More generally, when interactive systems display default values (such as 0) although the user has entered nothing, it becomes very difficult to distinguish a user who deliberately entered the default value from one who did nothing—the user may not even realize that ‘they’ have entered a value, which may therefore not be what they intended.

    • — Some systems display 0 to show that they are switched on, but this creates ambiguity over whether or not the user has started to interact. Instead, they could display patterns like [a non-numeric display pattern] rapidly alternating with [another non-numeric pattern], so they are obviously on and working.

    • — Many systems permanently display a decimal point [25], so if a user has entered nothing they will display 0. with a decimal point. This means that the user cannot tell whether they previously pressed just [0] or [0][.]: they cannot know from the display whether pressing [5] will change the number to 0.5 or to 5. This lack of predictability is a high price to pay for the ‘simplicity’ of always displaying a decimal point even when one has not been entered.

    Feature interaction is the name given to unanticipated interactions between system features [26]. Feature interaction is hard to anticipate because we focus so readily on individual features and think about them independently: it is much harder to think about features working together. In fact trying to think about two things at once interferes [27]: we do not do it very well and we tend to avoid doing it.

    On calculators one feature is that we want users to be able to enter large numbers, and another feature is that the display screen can only show up to a fixed number of digits, typically 8 or so. The feature interaction here happens when a user keys in more than the allowed 8 digits—what should happen? Each feature separately makes sense, and in fact each feature is so straightforward and obvious it hardly needs thinking about at all. Yet, put together the two features conflict. Most calculators ignore the resulting feature interaction, with the result that the users will perform incorrect calculations without warning—and generally without noticing.
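One defensible resolution (a sketch under our own assumption of an 8-digit display, not a description of any particular calculator) is to refuse, and report, keystrokes that cannot be displayed, rather than silently truncating:

```python
DISPLAY_DIGITS = 8   # assumed display capacity

def append_digit(numeral: str, digit: str) -> str:
    """Resolve the feature interaction explicitly: block entry beyond the
    display capacity instead of silently accepting undisplayable digits."""
    if sum(ch.isdigit() for ch in numeral) >= DISPLAY_DIGITS:
        raise ValueError("display full: digit rejected")
    return numeral + digit

n = ""
for d in "123456789":                # the ninth digit exceeds the display
    try:
        n = append_digit(n, d)
    except ValueError as e:
        print(e)                     # the user is warned rather than misled
print(n)                             # 12345678
```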

    We want to allow users to make and correct errors, but that obviously desirable feature interacts with the equally obvious feature that only correct numbers are displayed. What should happen when a user keys in a syntactically incorrect number? How can it be corrected? This is another feature interaction.

    Another feature interaction is the way negate works combined with the way delete works. The negate ([±]) key changes the sign of a number, and on most number entry systems it can be pressed at any time as a numeral is entered, in part because it is always represented as a prefixed minus sign whenever it is pressed. Thus, on the iOS calculator and many other devices with a change sign key, all of the following (each pressed after [AC], so they all start in the ‘ground state’): [±][1][2]; [1][±][2]; and [1][2][±] all result in −12 being displayed.

    Delete seems simple, but is an interesting feature. In general, one would expect that a sequence of keystrokes …ki−2, ki−1, ki followed by [delete] would be equivalent to …ki−2, ki−1. That is, the most recent keystroke ki ‘disappears’ when [delete] is pressed. Put into formal terminology, ki [delete] is the identity (for ki ≠ [delete]), or it should be expected to be.

    However, a feature interaction between change sign and delete arises. What happens when [delete] is pressed is unpredictable—it may or may not delete the sign prefix. The problem, of course, is that the change sign key is implemented to put the negative sign at the left of the numeral, yet the delete key works on the right-most end of the numeral. After a change sign, the delete key cannot work both at the end and at the start of a numeral, and that causes an unfortunate feature interaction. A simple solution would be for the delete key to work on the most recent key pressed, not on the right-most key shown—but this is a little harder to implement.
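A sketch of that solution (ours): keep the keystroke history, so that delete undoes the most recent keystroke, whatever it was and wherever its effect appears in the display:

```python
def press(keys: list[str], key: str) -> list[str]:
    """Delete removes the most recent keystroke (even a change sign),
    not the right-most displayed character."""
    return keys[:-1] if key == "DEL" else keys + [key]

def render(keys: list[str]) -> str:
    """Change sign ('+/-') toggles a prefixed minus however late it is pressed."""
    negative = keys.count("+/-") % 2 == 1
    digits = "".join(k for k in keys if k != "+/-")
    return ("-" if negative else "") + digits

keys: list[str] = []
for k in ["1", "2", "+/-", "DEL"]:   # delete undoes the change sign...
    keys = press(keys, k)
print(render(keys))                  # ...leaving '12', predictably
```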

    The [±]-with-delete feature interaction is caused by premature semantics: treating an interactive numeral as a simple signed number, and then not having any representation of what the user has done so that [delete] can be disambiguated.

    If a numeral is ambiguous or unreadable (or inaudible if spoken, e.g. by synthesized speech) it is unreliable.

    Legibility seems a self-evident criterion for a good interactive numeral, but as the Apple MacOS app example shows (figure 1c) it is easy to overlook. Other number entry systems routinely truncate numerals to fit displays, causing potential problems that are very hard to see. Elsewhere we have discussed legibility in more detail [25], and in particular, criticized poor choice of numeral fonts, such as seven segment displays.

    Legibility ought to be an obvious requirement, and achieving it (particularly with today’s high-resolution displays and quality audio outputs) is completely unproblematic. Yet poor legibility continues to be a factor in accidents and other problems [25]—note that this reference gives brief details of a fatal plane crash caused by poor numeral legibility as well as many recommendations for improving numeral legibility more generally.

    Sooner or later, users make errors, and they may or may not be aware of their errors [17]. We have shown elsewhere [20] that if interactive numeral systems detect syntax errors (such as two decimal points) the effective error rate can be reduced; for instance, the ‘out by 10’ (the number entered is 10 times too large or too small) error rate can be halved.

    In practice, interactive number entry systems rarely detect user error. We gave examples above of a user entering numbers that are too large: the results are wrong—yet detecting this error is trivial programming. In some systems, user error is ironically treated as a ‘feature’. For example, on the Graseby 3400 infusion pump (a drug delivery device) the decimal point is treated as a special case of [delete]: it deletes the decimal part of a number, so a sequence such as [3][.][2][.][5] enters 3.5, whereas many other systems (like the Apple iOS and MacOS pane, but not the MacOS app) will just ignore additional decimal points, even though they key-click, misleadingly confirming that the keys are being processed normally.

    Displaying NaN is a simple example of ignoring error. NaN is displayed when the program attempts to perform an invalid operation on a number: the hardware (or the virtual machine) has detected the error that the programmer has ignored (figure 1b).

    Unlike the other Apple calculators, the MacOS app allows the user to enter arbitrarily large numbers, which as we explained above get displayed in a smaller and smaller font until they are unreadable. Continuing to use the app eventually makes an erroneous calculation and displays NaN. Presumably, as each digit, say [9], is pressed, the interactive numeral processing calculates d′ = 10d + 9, updating the displayed number d to d′, but eventually 10d or 10d + 9 is too large for the underlying floating point representation to handle. A correct program would check d was in range before attempting an invalid calculation—even better, it would also check d′ would be displayed correctly (which none of the Apple calculators do). Even displaying the resulting NaN to the user is another ignored error.
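The missing check is a one-liner. A sketch (ours), using the fact that an IEEE double represents every integer exactly only up to 2^53:

```python
MAX_EXACT = 2**53   # limit of exactly representable integers in a double

def append_digit_value(d: float, digit: int) -> float:
    """Update the displayed value when a digit key is pressed, checking that
    the result is still exactly representable before committing to it."""
    d_new = 10 * d + digit
    if d_new > MAX_EXACT:
        raise OverflowError("number too large to represent exactly")
    return d_new

d = 0.0
for _ in range(15):
    d = append_digit_value(d, 9)    # 999 999 999 999 999 is still exact
print(d)                             # a 16th nine would raise OverflowError
```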

    All three Apple calculators allow pasting in a number. For example, 1,2,3.4.5 becomes 123.4 on the iOS and pane calculators, but 123.45 on the app. None of the calculators detects the errors. Ignoring the commas ‘makes sense’ only because it is an easy way to handle correct numbers like 10,000; but the same reasoning also ignores problems with erroneous numerals like 10,00. Ignoring errors, including discarding digits and decimal points, is very unhelpful—users expect correct answers, not any old answers.
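For instance, a strict validator (a sketch, ours) accepts correctly grouped numerals like 10,000 but rejects 10,00 and 1,2,3.4.5 instead of silently ‘repairing’ them:

```python
import re

# Either digits grouped in threes with commas, or plain digits; at most one
# decimal point. Anything else is an error to report, not to patch up.
WELL_FORMED = re.compile(r"(\d{1,3}(,\d{3})*|\d+)(\.\d+)?")

def paste_numeral(text: str) -> float:
    if not WELL_FORMED.fullmatch(text):
        raise ValueError(f"ill-formed numeral rejected: {text!r}")
    return float(text.replace(",", ""))

print(paste_numeral("10,000"))   # 10000.0
paste_numeral("1,2,3.4.5")       # ValueError, not a silent 123.4 or 123.45
```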

    Many systems, however, do detect use error; for example, one of many studies using ‘smart pumps’ (interactive drug infusion devices with number range error detection) along with other interventions reduced error rates by 73% with paediatric patients [28]. Given the obvious benefits and effectiveness of error detection, it is, then, very surprising that interaction error is so widely ignored: the difficulty is clearly not reporting error as such (which is how so-called smart pumps work), but bothering to program systems to detect errors during the process of interactive numeral entry.

    Ignoring error is a special case of poor programming, which we discuss next.

    Poor programming explains many problems.

    We can describe the programming problem succinctly: user operations on numeral syntax define an abstract data type for data entry. The continual mapping of that abstract data type to computer number representations (e.g. floating point numbers) as the user enters and edits numbers becomes a mess if not programmed correctly.

    Type checking helps detect inconsistencies within a program and helps improve program quality generally, but it does not specifically help improve interaction. To make the point, figure 6 illustrates problems that can arise without type checking. Clearly, some possible interaction design defects can be completely prevented by using a typed language; however, type checking alone is not sufficient to correctly manage the problem of the user entering ‘numerals’ that are not even parsable, as illustrated in the figure.

    Figure 6. A user might enter the data ‘I don’t know’ into a numeric field on a web site programmed in JavaScript. JavaScript is a very popular web language but it is not type-checked. JavaScript will happily compare the non-numeric string the user has entered with the number zero, without any errors at compile time or run time. Since what the user entered is neither negative nor positive, the program will deduce it must be zero, which of course it is not. The error may then propagate through the program and cause further problems. In a type-checked language, strings and numbers are not comparable and a program like this would not compile, so the programmer would have to develop a ‘type correct’ program before it could be used.
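The same point in Python terms (our illustration): the comparison fails at run time in a dynamically checked language, and a static checker such as mypy rejects it before the program ever runs—only the untyped JavaScript silently answers ‘zero’:

```python
def classify(value: float) -> str:
    """Classify a number by sign; the annotation documents the premise."""
    if value < 0:
        return "negative"
    if value > 0:
        return "positive"
    return "zero"

user_input = "I don't know"   # what the user actually typed
# classify(user_input)        # mypy rejects the incompatible "str" argument;
#                             # running it anyway raises TypeError in Python,
#                             # whereas JavaScript silently concludes "zero"
```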

    When interacting with programs, users can do anything, and a dependable interactive program has to respond coherently to all possible user interactions. Interestingly, neither the philosophy of static type checking nor that of dynamic type checking aligns well with this requirement: incorrect types cause incorrect programs to be rejected, and the programmer has to start again after fixing the errors. While that makes sense and is undeniably a very useful approach for dependable programming, in user interface interaction rejecting the user’s input is rarely a good option—an interactive program has to keep running despite error, and it has to manage error as the user gradually repairs the input to something acceptable. Even if it may be a good idea for the user to ‘start again’ (e.g. when entering a quasi-numeral as an incorrect security code), the program cannot terminate.

    Very often there are unhelpful compromises: the user can make an error, but their input will not be saved (or otherwise processed) until all the data are ‘correct’. This is essentially the type-correct part of the program refusing to allow the user to continue interaction until all detected errors are fixed. If the user needs to pause while they decide how to proceed, many programs will further aggravate the situation by timing out and discarding the user’s partial input. Put another way, the problem is that the ‘correct’ program has required numbers prematurely, but the user interface is supporting interactive numerals—and the program expects the user to save only when all interactive numerals are well-formed numerals (and similarly for other types of data, not just numerals). As explained elsewhere in this paper (§5.1), this is premature semantics.

    Users learn how to enter numbers on systems and acquire habits (such as shortcuts and specific ways of using the correction features). Thus users will internalize the ‘mess’, and if the mess differs between systems, this will induce unnecessary errors (so-called transfer errors). The unnecessary variation between the three Apple calculators, despite their very similar look and feel, is a case in point.

    Examples earlier showed large numbers may be accurately displayed when they are the result of a calculation, but not when the user enters them. Thus, although numbers as such are handled well, interactive numerals are not. Worse, there is usually no warning to the user that displayed values are incorrect when numbers entered by the user have been truncated.

    The problems with large numbers probably happen because a standard, general-purpose, number output routine is used to display calculated values, but user-input numbers are displayed differently: a basic programming inconsistency. Considering the additional inconsistencies over cut and paste discussed above, it is likely that those calculators do not use standard text entry routines, but handle button presses in ad hoc ways.

    We conclude that correct interactive numeral programming is harder than most people think—poor programming is deceptively easy. The cognitive effort of programming well creates tunnel vision: that is, programmers make mistakes they are unaware of [29]. Modern software engineering techniques like formal methods and code review must be used to mitigate these predictable problems, and evaluation techniques must be used to check whether designs meet their requirements in actual use.

    The evidence presented in this paper suggests even major manufacturers are not using such techniques to help avoid problems. We would argue that any computer system programmed by people who do not use such techniques [30] should not be used for any critical application.

    Some readers of this paper have pointed out that some programming languages, such as Haskell and Swift (and more generally, many programming languages that have strong typing), have built-in mechanisms to help avoid premature semantics. However, the problems this paper has pointed out can be avoided in any programming language. The issue is not the programming language used, or the programming language that might have been used, or even whether the programmer uses data validation. The issue is the lack of knowledge about interactive numerals.

    Interactive numerals are more complex than most programmers think, so they implement them simplistically. Even the world’s very best programmers are not immune. If programmers are unaware of premature semantics and feature interaction, they will make mistakes, which will then catch out users later. Understandably, users are unaware of these issues and they risk being misled, particularly in situations of error—precisely, then, at the very times when being misled is especially counter-productive.

    This paper makes several important points:

    • (i) Our definition of interactive numerals is essentially equivalent to a high-level specification that programmers could implement. Variations from this specification would allow programmers to identify with clarity any design compromises, and so reason about the consequences of their compromises in their approach to interactive numerals.

    • (ii) The everyday practice of programming has not yet advanced to the point where interactive numerals are reliably implemented, nor to a point where users can reliably use them. We have developed some evaluation tools [21] that can rapidly highlight problems and help select safer designs, but formal methods should also be used to avoid problems in the first place.

    • (iii) Premature semantics is a concept describing a common type of defect in programming, and is particularly prevalent in poor interactive numeral implementations. Now named, it can be actively managed by programmers.

    • (iv) Problems with interactive numeral implementations mean that logs of interactive system usage may record what systems do but do not reliably record what was done by users. This hinders accurate understanding of any use problems, and may have legal repercussions: logs must not be naïvely used as evidence of what users have or have not done [31]—they only show what the system records after premature semantics has over-simplified it. Logs do not give any reliable insight into what users did (let alone why [31]), unless there is rigorous evidence of correct design and operation, including appropriate forensic procedures (e.g. digital signatures) to confirm log data is not contaminated by poor programming or cyber attacks.

    • (v) Interactive numerals can be discussed rigorously because numeral syntax and number semantics are familiar, well-defined, precise concepts. It is, therefore, easy (if one asks the right questions) to show whether they are implemented correctly. There are, however, many other forms of interactive feature that beg further study. For example, typing letters with accents (e.g. ï, é, ç, as well as keys that are not on some keyboards, like æ, ß, ł) requires several keystrokes; what then should key sequences like [an accent dead key followed by delete] do? Indeed, the synchronization of the user’s model and the program’s semantic model at the keystroke level is a topic that has been excluded from classic research [32].

    Being able to buy and use safe interactive numeral systems, let alone assure yourself they are as dependable as they seem, remains problematic. For the time being, in safety and mission critical areas users must compensate for poor design by adopting strategies (such as repeating calculations entered in different ways) to help detect error. It was beyond the scope of the present paper to discuss human factor mitigations, such as range validation and the user interface redisplaying numbers in contrasting numeral formats (e.g. spoken words) to help the user confirm that the number they intended to enter was indeed correctly entered.

    With this paper, we hope we have clarified the fundamental role of interactive numerals in numerical interactive systems—that is, almost all interactive systems. From articulating the design and use errors around interactive numerals, our examples show that users, designers and procurers need to pay very close attention to their correct implementation. And implementations more consistent with our definitions are demonstrably possible: for instance, we can indicate contingent semantic correctness [10,33] or systematically avoid premature semantics [34–36]. But these are entirely new approaches for users and designers of number entry systems. They may bring new, different problems, and there are most probably even better, more reliable, more analysable implementations. However, having now identified the conceptual target of interactive numerals, we are in a better position to work towards the engineering of more dependable interactive systems. The extra effort needed has very high leverage because, for many applications, getting interactive numerals right will help millions of users over the lifetime of the improved products.

    All experimental data and results are fully described in the paper.

    The authors jointly wrote this paper, both making substantial contributions to the research, analysis and interpretation. Both approve the final version.

    The authors have no competing interests.

    H.T. was funded by Engineering and Physical Sciences Research Council (grant no. EP/L019272/1).

    The authors are very grateful for comments from Rod Chapman, Martyn Thomas and John Tucker. Ben Shneiderman prompted the thinking that led to this exposition of interactive numerals.

    Footnotes

    1 ‘Radix point’ is the generic term for decimal point, without regard for the base of the numeral. This paper generally uses the term ‘decimal point’ because of its familiarity, but without implying any loss of generality.

    2 Excel’s (v. 15.30 for OSX) PRODUCT calculates the product of a set of numbers. Curiously, if the set only contains 5.5 kg, etc., the product is zero, but if there are ‘conventional’ values as well in the set, like 32, then the PRODUCT operation treats 5.5 kg as 1.0. The documentation for PRODUCT is incorrect too: ‘You can also perform the same operation by using the multiply (*) mathematical operator; for example, =A1*A2.’ But * does not implement the same operation, as it gives #VALUE! when PRODUCT behaves as described above. This suggests that Microsoft has lost track of the semantics of their interactive numerals.

    3 Similar problems are ubiquitous across many devices and applications. In this paper, we present some initial examples from Apple calculators, though figure 2 shows problems are not limited to Apple. Apple have a leading reputation for high quality user interfaces, and also develop across several distinct and widely-available platforms. We provide version numbers so results can be easily reproduced.

    4 E and e are commonly used interchangeably in decimal numerals in programming languages to mean power of 10, so programmers may already be comfortable with the notation, but this does not excuse carrying over a programming notation into a general-purpose user interface notation, especially when it so obviously creates knock-on problems for the user. Most users are not programmers.

    5 Obviously the decimal value 0.1 can be represented using structured binary values, such as rationals (e.g. as two binary numbers, representing 1 divided by 10) or as binary coded decimal, and in many other ways.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    • 1
    • 2 Thimbleby H. 2011 Interactive numbers—a grand challenge. In Proc. IHCI 2011: IADIS Int. Conf. Interfaces and Human Computer Interaction 2011 (ed. K Blashki), pp. xxviii–xxxv. International Association for the Development of the Information Society.
    • 3 Gustafson JL. 2015 The end of error. Boca Raton, FL: CRC Press.
    • 4 Thimbleby H. 2000 Calculators are needlessly bad. Int. J. Hum. Comput. Stud. 52, 1031–1069. (doi:10.1006/ijhc.1999.0341)
    • 5 Oladimeji P, Masci P, Curzon P, Thimbleby H. 2015 Issues in number entry user interface styles: recommendations for mitigation. In Proc. 5th EAI Int. Conf. on Wireless Mobile Communication and Healthcare, pp. 4–7. Ghent, Belgium: European Alliance for Innovation.
    • 6
    • 7 Thimbleby H. 2007 Interaction walkthrough: evaluation of safety critical interactive systems. In Proc. XIII Int. Workshop on Design, Specification and Verification of Interactive Systems—DSVIS 2006 (eds G Doherty, A Blandford). Lecture Notes in Computer Science, vol. 4323, pp. 52–66. Berlin, Germany: Springer.
    • 8 Cauchi A, Gimblett A, Curzon P, Masci P, Thimbleby H. 2012 Safer ‘5-key’ number entry user interfaces using differential formal analysis. In Proc. BCS Conf. on HCI, vol. XXVI, pp. 29–38. Oxford, UK: Oxford University Press.
    • 9 Thimbleby H. 2008 Ignorance of interaction programming is killing people. ACM Interact. 15, 52–57. (doi:10.1145/1390085.1390098)
    • 10 Thimbleby H. 2015 Safer user interfaces: a case study in improving number entry. IEEE Trans. Softw. Eng. 41, 711–729. (doi:10.1109/TSE.2014.2383396)
    • 11 Lewis A, Williams J, Thimbleby H. 2015 Making healthcare safer by understanding, designing and buying better IT. Clin. Med. 15, 258–262. (doi:10.7861/clinmedicine.15-3-258)
    • 12 Cohen D. 2012 How a fake hip showed up failings in European device regulation. Br. Med. J. 345, e7090. (doi:10.1136/bmj.e7090)
    • 13 Olsen KA. 2008 The $100,000 keying error. IEEE Comput. 41, 106–107. (doi:10.1109/MC.2008.135)
    • 14 Champion M. 2017 This is what it’s like to be wrongly accused of being a paedophile because of a typo by police. BuzzFeed News. See buzzfeed.com.
    • 15 Oladimeji P, Cox A, Thimbleby H. 2011 Number entry interfaces and their effects on errors and number perception. In Proc. IFIP Conf. on Human–Computer Interaction, pp. 178–185. Laxenburg, Austria: International Federation of Information Processing (IFIP).
    • 16 Soboczenski F, Hudson M, Cairns P. 2015 The effects of perceptual interference on number-entry errors. Interact. Comput. 28, 208–218. (doi:10.1093/iwc/iwv034)
    • 17 Oladimeji P, Cox A, Thimbleby H. 2013 A performance review of number entry interfaces. In Proc. IFIP Conf. on Human–Computer Interaction, pp. 365–382. Laxenburg, Austria: International Federation of Information Processing (IFIP).
    • 19 Moore C. 2017 Medical radiological incidents: the human element in complex systems. In 25 at 25: a selection of articles from twenty-five years of the SCSC newsletter Safety Systems (eds G Jolliffe, M Parsons, T Kelly), pp. 25–30. Safety-Critical Systems Club. Newcastle upon Tyne, UK: Centre for Software Reliability.
    • 20 Thimbleby H, Cairns P. 2010 Reducing number entry errors: solving a widespread, serious problem. J. R. Soc. Interface 7, 1429–1439. (doi:10.1098/rsif.2010.0112)
    • 21 Thimbleby H, Cairns P, Oladimeji P. 2015 Unreliable numbers: error and harm induced by bad design can be reduced by better design. J. R. Soc. Interface 12, 20150685. (doi:10.1098/rsif.2015.0685)
    • 22 Hopper A, McCanny J. 2016 Progress and research in cybersecurity supporting a resilient and trustworthy system for the UK. London, UK: Royal Society.
    • 23 Blandford A, Cox A, Curzon P, Thimbleby H. 2016 Research manifesto. See http://www.chi-med.ac.uk/insights.
    • 25 Thimbleby H. 2013 Reasons to question seven segment displays. In Proc. ACM Conf. Computer–Human Interaction, pp. 1431–1440. New York, NY: ACM.
    • 26 Calder M, Kolberg M, Magill EH, Reiff-Marganiec S. 2003 Feature interaction: a critical review and considered forecast. J. Comput. Netw. 41, 115–141. (doi:10.1016/S1389-1286(02)00352-3)
    • 27 Pashler H. 1994 Dual-task interference in simple tasks: data and theory. Psychol. Bull. 116, 220–244. (doi:10.1037/0033-2909.116.2.220)
    • 28 Larsen GY, Parker HB, Cash J, O’Connell M, Grant MC. 2005 Standard drug concentrations and smart-pump technology reduce continuous-medication-infusion errors in pediatric patients. Pediatrics 116, e21–e25. (doi:10.1542/peds.2004-2452)
    • 29 Thimbleby H. 2016 Human error in safety-critical programming. In Developing safe systems, Proc. 24th Safety-Critical Systems Symposium (eds M Parsons, T Anderson), pp. 183–202. Newcastle upon Tyne, UK: Centre for Software Reliability.
    • 31 Dekker SWA. 2001 The disembodiment of data in the analysis of human factors accidents. Hum. Factors Aerosp. Saf. 1, 39–57.
    • 32 Card SK, Moran TP, Newell A. 1983 The psychology of human–computer interaction. London, UK: L. Erlbaum Associates.
    • 33 Thimbleby H, Gimblett A. 2011 Dependable keyed data entry for interactive systems. Electron. Commun. EASST 45, 1/16–16/16. (doi:10.14279/tuj.eceasst.45.642)
    • 34 Thimbleby H. 1995 A new calculator and why it is necessary. Comput. J. 38, 418–433. (doi:10.1093/comjnl/38.6.418)
    • 35 Thimbleby W. 2004 A novel pen-based calculator and its evaluation. In Proc. Nordic Conf. on Human–Computer Interaction, pp. 445–448. New York, NY: ACM.
    • 36 Cairns P, Wali S, Thimbleby H. 2004 Evaluating a novel calculator interface. In Proc. British Computer Society HCI Conf. (eds A Dearden, L Watts), vol. 2, pp. 9–12. Research Press International.


    Page 3

    In the digital world, one of the major and essential issues is to protect the secrecy of confidential data during their transmission over a public channel. In general, the confidential digital data are pre-processed before their transmission over a public channel. This pre-processing operation changes the content of the information into another form, but only an authorized person is capable of appropriately executing the reversible operation on the modified data to retrieve the original content. Several data protection techniques have been devised to protect the confidentiality of digital data. Cryptography [1] is one of the popular techniques used for the secure communication of confidential data. Since the data encryption technique produces a stream of meaningless code for transmission, it may attract an intruder to alter the message intentionally or to retrieve the message by exploiting various cryptographic attacks on the encrypted data.

    In contrast, steganography [2] is another mechanism to protect the secrecy of the data. It does not alter the data to make it meaningless to the intruder. In this mechanism, the secret data are embedded into any other unsuspected carrier or cover media like image, audio, video etc. to form a meaningful message that is known as stego-media. It is difficult to distinguish the stego-media from the original cover media by human visual perception. Hence compared with cryptography, the steganographic process prevents an unintended recipient from suspecting that secret data are being transmitted over a public channel through meaningful cover media. A steganography-based security system is used in various applications like military communication, commercial enterprises, Internet of Things and multimedia [3–5]. Several combined cryptographic and steganographic schemes [5,6] are found in the literature. Although the aim of both the cryptographic and steganographic schemes is to ensure data security, the combined approach of cryptography and steganography enhances the security system further with increased computational overhead. So these two security mechanisms, i.e. cryptography and steganography, are exploited distinctly in the field of information security.

    Image data are frequently used in various applications, and a number of image-based steganographic schemes for sharing confidential digital data securely are found in the literature. Among them, the least significant bit (LSB) substitution method [2] is one of the most widely used, owing to its simple embedding process and high hiding capacity. In this approach, the least significant bits of the cover pixels are replaced by the secret message sequentially. It has been observed that LSB replacement of up to three bits [2] retains a reasonably good quality stego-image along with a high embedding payload, and the visual quality of an LSB-based stego-image can be improved further by an optimal pixel adjustment process [2]. Other improved LSB substitution-based steganographic schemes are found in the literature, such as the scheme proposed by Yang [7]: instead of modifying the cover pixels directly, the secret message bits are inverted, and the inverted information, known as inverted patterns, is recorded for the purpose of extracting the secret message. Later, Chen [8] suggested an efficient scheme that improved the visual quality of the stego-image using LSB substitution along with a modulus function approach; in this scheme, repetition of the secret message is used to reduce the distortion in the stego-image. Recently, Xu et al. [9] proposed an improved LSB substitution scheme that works on a modulo-three strategy. The LSB-based steganographic scheme has fixed payload capacity, so to improve the payload further several researchers [10–12] have proposed edge-based steganographic schemes. In natural images, it has been noted that modification in a smooth region is more easily noticeable by human visual perception, and hence hiding more message bits in the edge region is preferred. Such a technique is proposed by Chen et al. [10], who developed an edge-based image steganographic scheme in which the edge pixels are identified by a combination of a fuzzy edge detector and the Canny edge detector; subsequently, more secret message bits are embedded in the edge region than in the non-edge region, using the LSB method. The combination of the fuzzy and Canny edge detectors effectively increases the number of edge pixels and, as a result, the embedding capacity of their scheme is high. In [11], the authors divided image pixels into two categories, edge pixels and non-edge pixels, with a larger number of secret bits embedded into each edge pixel than into the non-edge pixels; this improves the payload capacity at the cost of minute visual distortion in the stego-image. To preserve the high visual quality of the stego-image, the scheme of Islam et al. [12] conceals the secret message bitstream only in the edge region. In their scheme, the cover image is pre-processed so that the edge region remains the same even after the embedding of secret message bits; the edge region of the pre-processed cover image is located by a suitable threshold value, which is treated as a stego key. This process enhances the security level further.

    Apart from the LSB substitution method, another kind of steganographic scheme was proposed by Wu & Tsai [13], where the secret message is hidden by exploiting the difference between the intensity values of two successive pixels. Their method is known as pixel-value differencing (PVD) and is widely used in the data-hiding field. It computes the intensity difference of two consecutive pixels and determines the hiding capacity from that difference; hence in the PVD technique more data can be embedded in the edge region than in the smooth region. However, in the smooth region the hiding capacity is less than that of the LSB substitution method, so Khodaei & Faez [14] suggested a combination of the LSB and PVD methods in which three consecutive pixels are considered when hiding the secret message. Their scheme improves the embedding capacity while retaining acceptable visual quality of the stego-image. Several other PVD variants [3,15–20] are found in the literature. Lee et al. [3] introduced a tri-way PVD approach to improve the hiding capacity and to survive several steganalyses. Tseng & Leng [15] modified the traditional PVD-based quantization range table and introduced a new technique known as the perfect square number (PSN); the secret message bits are concealed using the PSN and their proposed quantization range table. Liao et al. [16] proposed a steganographic scheme based on four-pixel differencing and modified LSB substitution, exploiting the fact that pixels in the edge region can tolerate considerably more change without perceptible distortion than those in the smooth region. Swain [17] proposed another improved image steganographic scheme combining LSB and PVD, where the secret message bits are hidden in 2 × 2 non-overlapping pixel blocks of a cover image. Recently, another block-based PVD steganographic scheme was presented in [18], which considers 3 × 3 non-overlapping image blocks, and a seven-directional PVD scheme [19] with improved payload capacity is also found in the literature. Conventional PVD suffers from a falling-off boundary problem in some blocks: after the readjustment process, the distortions of those blocks are high compared with the other blocks, so it sometimes yields a low-quality stego-image. Some authors have addressed this problem, and their solutions are effective but computationally intensive. Zhao et al. [20] proposed PVD with a modulus function to improve the image quality while preserving the same embedding capacity as conventional PVD. In [21], the authors overcome the falling-off boundary problem by adopting an adaptive PVD approach.

    Several researchers have employed either LSB substitution or a PVD-based approach to devise efficient colour image steganographic schemes. In [22], the authors enhanced the security of a colour steganographic scheme by not concealing the secret message bits into each colour pixel in sequential order: the embedding is driven by a secret pseudorandom value that adaptively decides the payload capacity and the sequence in which secret message bits are embedded into each colour plane. Their indirect approach definitely enhances the security level. Another LSB substitution-based colour image steganography is found in [23], where the secret message bits are hidden with reference to an indicator colour plane instead of being embedded directly in order. A further secret key-based colour image steganography is suggested by Parvez & Gutub [24], where the secret message bits are spread over each colour plane based on a predefined secret key. A modified PVD-based steganography is proposed by Nagaraj et al. [25]; in their scheme, a modulus-3 function is used with PVD to embed secret message bits into colour pixels. Later, Prema & Manimegalai [26] proposed a colour image steganography using modified PVD: an RGB colour image is decomposed into non-overlapping blocks of two consecutive pixels, three different pairs, namely (R,G), (G,B) and (B,R), are formed from the two consecutive colour pixels, and the secret message is embedded based on the differences of the colour component pairs. They improved the hiding capacity while maintaining acceptable visual quality of the stego-image. Yang & Wang [27] devised a block-based smart pixel adjustment process in which a block of two colour pixels is considered during the secret message-embedding process; however, the hiding capacity of their scheme is modest. Adaptive PVD-based colour image steganography is suggested in [28], where the secret message is concealed at the block level of each colour plane, exploiting the vertical and horizontal edges in each block during message embedding. The above colour image steganographic schemes basically work on a colour plane instead of on colour pixels. Hence in this paper we propose an RGB colour image steganography where the secret message is concealed into each colour pixel independently. The proposed scheme chooses one colour pixel at a time and embeds the secret message into it by employing modified PVD appropriately: the colour pixel is grouped into two pairs, namely (R,G) and (G,B), to form two overlapping blocks, and PVD is applied to each pair to embed the secret message bits. Afterwards, the proposed readjustment process is carried out on each pair to obtain the final modified stego colour components, i.e. the R, G and B components. The readjustment process ensures that, in the decoding process, PVD is applicable to extract the secret message bits from the stego colour pixel. The proposed scheme improves the embedding capacity because it uses overlapping blocks.

    The rest of the paper is organized as follows. Section 2 presents the basic idea of the PVD method. The details of the proposed scheme are described in §3. The experimental results are presented in §4. Finally, §5 concludes the paper.

    The PVD method [13] uses grey-level images as the cover image, and variable-sized secret message bit sequences are embedded into the cover image: fewer secret message bits are embedded in the smooth regions than in the edge regions. Initially, the cover image is partitioned into non-overlapping blocks of size 1 × 2 in raster scan order. The two consecutive pixels in the ith block are denoted Pi and Pi+1, respectively. The difference value di between the two pixels is calculated as di = |Pi − Pi+1| and denotes the variation present in the block: a small value of di suggests a smooth region, whereas a larger value indicates an edge region. Since a greyscale image has 256 intensity values, di lies in the range [0, 255]. The di value is quantized into several regions Ri, as shown in figure 1, with the lower and upper bounds of each Ri denoted [loweri, upperi]. The number of secret bits (t) embedded in two consecutive pixels depends on the quantization range table and is computed as t = ⌊log2(upperi − loweri + 1)⌋. The selected bit sequence is converted into its decimal value, td, and the new difference value di′ is obtained as di′ = td + loweri.
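Concretely, the capacity computation looks like the following sketch (ours), which assumes the widely used Wu–Tsai quantization range table, since the figure 1 table is not reproduced here:

```python
import math

# Assumed Wu-Tsai style quantization ranges [lower_i, upper_i].
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def range_of(d: int) -> tuple[int, int]:
    """Return the quantization range R_i containing the block difference d."""
    for lower, upper in RANGES:
        if lower <= d <= upper:
            return lower, upper
    raise ValueError("difference out of range")

def capacity(d: int) -> int:
    """t = floor(log2(upper_i - lower_i + 1)) secret bits for difference d."""
    lower, upper = range_of(d)
    return int(math.log2(upper - lower + 1))

print(capacity(18))   # 4 bits, since 18 lies in [16, 31]
```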

    Figure 1. The quantization range.

    The modified pixel values are computed based on the following condition:

    (Pi′, Pi+1′) =
        (Pi + ⌈m/2⌉, Pi+1 − ⌊m/2⌋)   if Pi ≥ Pi+1 and di′ > di
        (Pi − ⌊m/2⌋, Pi+1 + ⌈m/2⌉)   if Pi < Pi+1 and di′ > di
        (Pi − ⌈m/2⌉, Pi+1 + ⌊m/2⌋)   if Pi ≥ Pi+1 and di′ ≤ di
        (Pi + ⌈m/2⌉, Pi+1 − ⌊m/2⌋)   if Pi < Pi+1 and di′ ≤ di        (2.1)

    where m=|di′−di|.

    In this method, the ith block pixels Pi and Pi+1 are replaced by the stego pixels Pi′ and Pi+1′. After the embedding process, the receiver computes the difference of the ith block, di′ = |Pi′ − Pi+1′|. The difference di′ is used to look up the number of concealed bits in the ith block using the quantization ranges of figure 1, and the secret bitstream is obtained by converting the decimal value (di′ − loweri) into binary form. An example of the PVD process is illustrated below.

    We illustrate the embedding procedure in figure 2 with a pair of consecutive pixels, 102 and 120, from a cover image. Compute d = |120 − 102| = 18; the lower and upper bounds are found from figure 1. The difference value d = 18 belongs to region R3, with lower = 16 and upper = 31. The number of secret message bits is decided based on

    t = ⌊log2(31 − 16 + 1)⌋ = 4 bits.

    Suppose the 4-bit binary secret message is 1011₂, whose corresponding decimal value is 11₁₀. The modified difference and m are calculated as follows:

    di′ = loweri + secret message (decimal) = 16 + 11 = 27,
    m = |di′ − di| = |27 − 18| = 9.

    Finally, as per equation (2.1), the stego pixels will be computed as follows:

    (Pi′, Pi+1′) = (102 − 4, 120 + 5), since 102 < 120 and 27 > 18,
    so (Pi′, Pi+1′) = (98, 125).

    Figure 2. Embedding procedure in the PVD approach.

    The above example is graphically represented in figure 2. In the extraction process, the difference d′ = |98 − 125| = 27, which belongs to region R3. The number of embedded secret bits is computed from the lower and upper values of R3: t = ⌊log2(31 − 16 + 1)⌋ = 4 bits. The decimal value of the secret message is (d′ − lower) = 27 − 16 = 11, and its 4-bit binary representation is 1011₂.
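The round trip can be checked mechanically. The following sketch (ours, again assuming the Wu–Tsai range table for figure 1) reproduces the worked example:

```python
import math

RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]  # assumed

def range_of(d: int) -> tuple[int, int]:
    return next((lo, up) for lo, up in RANGES if lo <= d <= up)

def embed(p1: int, p2: int, bits: str) -> tuple[int, int]:
    """Embed capacity-many leading bits into the pixel pair (p1, p2) by PVD,
    spreading the change over both pixels as in equation (2.1)."""
    d = abs(p1 - p2)
    lower, upper = range_of(d)
    t = int(math.log2(upper - lower + 1))
    d_new = lower + int(bits[:t], 2)          # new difference encodes the bits
    m = abs(d_new - d)
    up, down = math.ceil(m / 2), m // 2
    if d_new > d:
        return (p1 + up, p2 - down) if p1 >= p2 else (p1 - down, p2 + up)
    return (p1 - up, p2 + down) if p1 >= p2 else (p1 + up, p2 - down)

def extract(q1: int, q2: int) -> str:
    """Recover the embedded bits from a stego pixel pair."""
    d = abs(q1 - q2)
    lower, upper = range_of(d)
    t = int(math.log2(upper - lower + 1))
    return format(d - lower, f"0{t}b")

print(embed(102, 120, "1011"))   # (98, 125), as in the worked example
print(extract(98, 125))          # '1011'
```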

    The proposed colour image steganographic scheme is presented in this section. Initially, each colour pixel is decomposed into its corresponding colour components, i.e. R, G and B. We then form two pairs, (R,G) and (G,B); other ordered pairs would also be acceptable, but in this work we implement the scheme using these two. (R,G) and (G,B) form two consecutive overlapping blocks, as shown in figure 3. In our scheme, a variable number of secret message bits is embedded based on the difference of each pair using PVD. After embedding the secret message bits into each pair, the intermediate colour components are further readjusted to obtain the final stego colour components. A natural colour image may be dominated by particular colour components, and as an outcome of the data-hiding process in such a pixel the distortion may be large enough to be perceived; we avoid this circumstance by adopting a suitable threshold value. The data-hiding capacity in each colour pixel is restricted by the threshold value, so that the stego-image retains high visual quality. Figure 4 shows the overall embedding process and figure 5 shows the decoding process. The algorithm steps of the proposed embedding and extraction procedures are presented as follows:

    Figure 3. RGB pixel blocks of a colour image.

    Figure 4. A schematic diagram of the data-embedding procedure.

    Figure 5. A schematic diagram of the data extraction procedure.

    [Algorithm steps of the proposed embedding and extraction procedures]

    An illustration of the secret message-embedding procedure is given in figure 6. Let the R, G and B colour components be 102, 120 and 130, respectively, and take a random secret message bitstream 10111011100111…. The (R,G) pair embeds the secret bits 1011₂ based on the PVD approach, and the other pair, (G,B), selects the secret message bits 101₂, also based on the PVD approach. The stego colour components obtained by our approach are 95, 122 and 135.

    Figure 6. Example of the embedding procedure.


    In this section, experimental results are presented to demonstrate the performance of the proposed scheme. The scheme has been tested on a set of standard colour images; in this paper we present results for six colour images, selected for their diverse image features, to estimate the performance in terms of visual quality and embedding capacity of the stego-images. The original images are shown in figure 7. Randomly generated message bits are used as the secret message bitstream in our experiment. The stego-images obtained after the embedding process are shown in figure 8, and it is observed that the imperceptibility of the stego-images is high. The histograms of the original cover images and stego-images are depicted in figures 9–20; the plotted histograms reveal the similarity between original and stego-images, suggesting that, in our proposed scheme, the disparities caused by embedding the secret message bitstreams are not noticeable in the stego-image. In addition, the differences in histogram levels are reasonably insignificant, as shown in figures 21–26. The stego-image quality is further estimated in terms of the peak signal-to-noise ratio (PSNR) and the embedding capacity (payload). Table 1 gives the results of the proposed scheme in terms of embedding capacity and PSNR, where we obtain acceptably high PSNR values for stego-images together with a high embedding capacity. Hence the PSNR values, as well as the visual appearance of the stego-images and their histograms, suggest that the distortion introduced by embedding the secret message into the cover image is reasonably low and imperceptible to human visual perception. The proposed scheme is also compared with other steganographic schemes in terms of embedding capacity and PSNR in table 1. The experimental results indicate that the proposed steganographic scheme appropriately meets the requirements of steganography: we have succeeded in embedding a large number of secret bits while maintaining acceptable visual quality of the stego-images.

    Figure 7. Original cover images used in the experiment.

    Figure 8. Stego-images after data hiding.

    Figure 9. (a) Lena cover image. (b–d) Histograms of red, green and blue components.

    Figure 10. (a) Lena stego-image. (b–d) Histograms of red, green and blue components.

    Figure 11. (a) Baboon cover image. (b–d) Histograms of red, green and blue components.

    Figure 12. (a) Baboon stego-image. (b–d) Histograms of red, green and blue components.

    Figure 13. (a) Jet cover image. (b–d) Histograms of red, green and blue components.

    Figure 14. (a) Jet stego-image. (b–d) Histograms of red, green and blue components.

    Figure 15. (a) Sailboat cover image. (b–d) Histograms of red, green and blue components.

    Figure 16. (a) Sailboat stego-image. (b–d) Histograms of red, green and blue components.

    Figure 17. (a) Pepper cover image. (b–d) Histograms of red, green and blue components.

    Figure 18. (a) Pepper stego-image. (b–d) Histograms of red, green and blue components.

    Figure 19. (a) Car–house cover image. (b–d) Histograms of red, green and blue components.

    Figure 20. (a) Car–house stego-image. (b–d) Histograms of red, green and blue components.

    Figure 21. Lena difference image histograms: (a) R, (b) G and (c) B.

    Figure 22. Baboon difference image histograms: (a) R, (b) G and (c) B.

    Figure 23. Jet difference image histograms: (a) R, (b) G and (c) B.

    Figure 24. Sailboat difference image histograms: (a) R, (b) G and (c) B.

    Figure 25. Pepper difference image histograms: (a) R, (b) G and (c) B.

    Figure 26. Car–house difference image histograms: (a) R, (b) G and (c) B.

Table 1. The simulation results. Cover images are 512 × 512 × 3; each cell gives capacity (bits) / PSNR (dB).

cover image | PVD method [13] | Yang & Wang [27] | Mandal & Das [21] | Swain's proposed method 1 [28] | proposed method
Lena | 1 234 394 / 41.25 | 196 608 / 41.58 | 1 234 394 / 40.21 | 1 341 192 / 46.17 | 1 976 671 / 31.01
baboon | 1 406 405 / 37.81 | 196 608 / 33.29 | 1 406 405 / 37.14 | 1 489 945 / 48.49 | 2 219 715 / 32.29
jet | 1 224 178 / 40.44 | 196 608 / 43.73 | 1 224 178 / 40.64 | 1 267 690 / 46.18 | 1 753 707 / 35.66
sailboat | 1 289 871 / 38.76 | 196 608 / 47.41 | 1 289 871 / 39.35 | 1 424 967 / 47.29 | 2 130 772 / 33.11
peppers | 1 236 715 / 40.31 | 196 608 / 39.43 | 1 236 715 / 40.37 | 1 350 251 / 47.06 | 1 783 210 / 30.10
car–house | 1 263 038 / 38.97 | 196 608 / 41.34 | 1 263 038 / 39.62 | 1 339 985 / 44.73 | 2 079 088 / 34.59
average | 1 275 766 / 39.59 | 196 608 / 41.13 | 1 275 766 / 39.55 | 1 369 005 / 46.65 | 1 990 527 / 32.79

Most colour image steganography methods operate on individual colour components rather than considering all components together. In contrast, the method proposed in this paper conceals the secret message bits directly in each pixel, sequentially. The proposed scheme applies the conventional PVD idea to overlapping blocks of colour components, and the proposed readjustment of colour components ensures that the conventional PVD-based decoding procedure remains applicable. The experimental results show that the proposed scheme has a larger hiding capacity with acceptable imperceptibility of the stego-image. In addition, the scheme is simple and easy to implement on RGB colour images.
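For reference, the sketch below illustrates the classic PVD embedding step of Wu & Tsai [13] on a single pair of grey-level values; it is a minimal illustration, not the authors' implementation. The six-range quantization table is the one commonly used in the PVD literature, and boundary (overflow/underflow) handling and the overlapping-block RGB extension of the proposed scheme are omitted.

```python
# Minimal sketch of the classic PVD embedding step (Wu & Tsai [13]) on one
# pair of grey-level values. The range table is the commonly used one;
# overflow/underflow handling and the RGB extension are omitted.

RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def embed_pair(p1, p2, bits):
    """Hide the leading bits of `bits` (a '0'/'1' string) in pixels p1, p2."""
    d = abs(p2 - p1)
    lo, hi = next(r for r in RANGES if r[0] <= d <= r[1])
    t = (hi - lo + 1).bit_length() - 1        # bits this pair can carry
    b = int(bits[:t], 2) if bits[:t] else 0   # secret value to embed
    m = (lo + b) - d                          # required change in difference
    if p1 >= p2:                              # distribute m over both pixels
        p1, p2 = p1 + (m + 1) // 2, p2 - m // 2
    else:
        p1, p2 = p1 - m // 2, p2 + (m + 1) // 2
    return p1, p2, bits[t:]                   # stego pixels + remaining bits

# example: |124 - 97| = 27 falls in [16, 31], so 27 - 16 = 11 = '1011'
print(embed_pair(100, 120, '10110'))          # -> (97, 124, '0')
```

At decoding time, the receiver recomputes the difference of each stego pair, locates its range and recovers b = d' − lo, which is why the readjustment step that keeps each difference inside its original range matters.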

    Our data have been deposited at Dryad (http://dx.doi.org/10.5061/dryad.21tm5) [29].

    Both authors contributed to the design and implementation of the research, and to the writing of the manuscript.

    We declare we have no competing interests.

The authors express their gratitude to the Indian Institute of Technology (ISM), Dhanbad, India, which is funded by the MHRD, Government of India.

    Footnotes

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

• 1. Trappe W, Washington LC. 2011 Introduction to cryptography with coding theory, 2nd edn. Delhi, India: Pearson Prentice Hall.
• 2. Chan CK, Cheng LM. 2004 Hiding data in images by simple LSB substitution. Pattern Recognit. 37, 469–474. (doi:10.1016/j.patcog.2003.08.007)
• 3. Lee YP, Lee J-C, Chen W-K, Chang K-C, Su I-J, Chang C-P. 2012 High-payload image hiding with quality recovery using tri-way pixel-value differencing. Inf. Sci. 191, 214–225. (doi:10.1016/j.ins.2012.01.002)
• 4. Al-Otaibi N, Gutub A. 2014 Flexible stego-system for hiding text in images of personal computers based on user security priority. In Proc. Int. Conf. on Advanced Engineering Technologies (AET-2014), Dubai, UAE, pp. 250–256.
• 5. Das R, Das I. 2016 Secure data transfer in IoT environment: adopting both cryptography and steganography techniques. In Proc. 2nd Int. Conf. on Research in Computational Intelligence and Communication Networks, Kolkata, India, pp. 296–301.
• 6. Zhou X, Gong W, Fu W, Jin L. 2016 An improved method for LSB based color image steganography combined with cryptography. In 2016 IEEE/ACIS 15th Int. Conf. on Computer and Information Science (ICIS), Okayama, Japan, pp. 1–4.
• 7. Yang C-H. 2008 Inverted pattern approach to improve image quality of information hiding by LSB substitution. Pattern Recognit. 41, 2674–2683. (doi:10.1016/j.patcog.2008.01.019)
• 8. Chen S-K. 2011 A module-based LSB substitution method with lossless secret data compression. Comput. Stand. Interfaces 33, 367–371. (doi:10.1016/j.csi.2010.11.002)
• 9. Xu W-L, Chang C-C, Chen T-S, Wang L-M. 2016 An improved least-significant-bit substitution method using the modulo three strategy. Displays 42, 36–42. (doi:10.1016/j.displa.2016.03.002)
• 10. Chen WJ, Chang CC, Le TH. 2010 High payload steganography mechanism using hybrid edge detector. Expert Syst. Appl. 37, 3292–3301. (doi:10.1016/j.eswa.2009.09.050)
• 11. Pal AK, Pramanik T. 2013 Design of an edge detection based image steganography with high embedding capacity. LNICST 115, 794–800. (doi:10.1007/978-3-642-37949-9_69)
• 12. Islam S, Modi MR, Gupta P. 2014 Edge based steganography on colored images. In Intelligent computing theories (eds DS Huang, V Bevilacqua, JC Figueroa, P Premaratne), Lecture Notes in Computer Science, vol. 7995, pp. 593–600. Berlin, Germany: Springer. (doi:10.1007/978-3-642-39479-9_69)
• 13. Wu D-C, Tsai W-H. 2003 A steganographic method for images by pixel-value differencing. Pattern Recognit. Lett. 24, 1613–1626. (doi:10.1016/S0167-8655(02)00402-6)
• 14. Khodei M, Faez K. 2012 New adaptive steganographic method using least significant bit substitution and pixel value differencing. IET Image Process. 10, 667–686. (doi:10.1049/iet-ipr.2011.0059)
• 15. Tseng H-W, Leng H-S. 2013 A steganographic method based on pixel-value differencing and the perfect square number. J. Appl. Math. 2013, 189706. (doi:10.1155/2013/189706)
• 16. Liao X, Wen Q-Y, Zhang J. 2011 A steganographic method for digital images with four-pixel differencing and modified LSB substitution. J. Vis. Commun. Image R. 22, 1–8. (doi:10.1016/j.jvcir.2010.08.007)
• 17. Swain G. 2016 A steganographic method combining LSB substitution and PVD in a block. In Int. Conf. on Computational Modeling and Security (CMS 2016), pp. 39–44.
• 18. Hosam O, Halima NB. 2016 Adaptive block-based pixel value differencing steganography. Secur. Commun. Netw. 9, 5036–5505. (doi:10.1002/sec.1676)
• 19. Pradhan A, Sekhar KR, Swain G. 2016 Digital image steganography based on seven way pixel value differencing. Indian J. Sci. Technol. 9. (doi:10.17485/ijst/2016/v9i37/88557)
• 20. Zhao W, Jie Z, Xin L, Qiaoyan W. 2015 Data embedding based on pixel value differencing and modulus function using indeterminate equation. J. China Univ. Posts Telecommun. 22, 95–100. (doi:10.1016/S1005-8885(15)60631-8)
• 21. Mandal JK, Das D. 2012 Steganography using adaptive pixel value differencing (APVD) of gray images through exclusion of overflow/underflow. In 2nd Int. Conf. on Computer Science, Engineering and Applications (CCSEA-2012), Delhi, India.
• 22. Al-Qahtani A, Tabakh A, Gutub A. 2009 Triple-A: secure RGB image steganography based on randomization. In 7th ACS/IEEE Int. Conf. on Computer Systems and Applications (AICCSA-2009), Rabat, Morocco, pp. 400–403.
• 23. Gutub AA. 2010 Pixel indicator technique for RGB image steganography. J. Emerg. Technol. Web Intell. (JETWI) 2, 56–64. (doi:10.4304/jetwi.2.1.56-64)
• 24. Parvez MT, Gutub AA. 2011 Vibrant color image steganography using channel differences and secret data distribution. Kuwait J. Sci. Eng. (KJSE) 38, 127–142.
• 25. Nagaraj V, Vijayalakshmi V, Zayaraz G. 2013 Colour image steganography based on pixel value modification method using modulus function. In 2013 Int. Conf. on Electronic Engineering and Computer Science, pp. 17–24.
• 26. Prema C, Manimegalai D. 2014 Adaptive color image steganography using intra color pixel value differencing. Aust. J. Basic Appl. Sci. 8, 161–167.
• 27. Yang C-Y, Wang W-F. 2015 Block based color image steganography using smart pixel-adjustment. Adv. Intell. Syst. Comput. 329, 145–154. (doi:10.1007/978-3-319-12286-1_15)
• 28. Swain G. 2016 Adaptive pixel value differencing steganography using both vertical and horizontal edges. Multimed. Tools Appl. 75, 13 541–13 556. (doi:10.1007/s11042-015-2937-2)
• 29. Prasad S, Pal AK. 2017 Data from: An RGB colour image steganography scheme using overlapping block-based pixel-value differencing. Dryad Digital Repository. (http://dx.doi.org/10.5061/dryad.21tm5)


    Page 4

Protein–protein interactions (PPIs) play vital roles in several cellular processes, such as signal transduction, cell proliferation, cell adhesion and apoptosis [1]. Many disease pathways, including different stages of cancer development and host–pathogen interactions, are associated with key PPIs [2]. Disrupting these crucial PPIs is now thought to be a potential strategy for developing novel therapeutics [3], and identification of hotspots at the interface or contact area of PPIs is considered a promising route to new drug targets [4–6]. The small chemical molecules that inhibit PPIs at their interfaces [7–9] are called PPI modulators (PPIMs). These PPIMs are very useful in designing drugs for various diseases, including cancer. Although in silico identification of such compounds remains challenging in drug discovery, a few PPIMs have been identified and tested clinically in oncogenic studies; examples include Nutlin-3a (Mdm2/P53) and ABT-263 and GX15-070 (Bcl2/Bak) [10–12]. Therefore, the interface areas of PPIs, and the identification of novel PPIMs that can inhibit an orthosteric region, have become a central focus of many researchers.

In this study, three well-known oncogenic PPIs, namely Mdm2/P53, Bcl2/Bak and c-Myc/Max, were chosen as the model system for identifying novel PPIMs. These three PPIs are transient in nature and play critical roles in cell growth and programmed cell death (apoptosis), indicating their involvement in cell proliferation. Indeed, a plethora of studies has established their roles in different stages of cancer development. Mdm2 is a negative regulator of P53, a tumour suppressor protein; P53 regulates the cell cycle and induces apoptosis in response to various stresses, particularly DNA damage, thereby preventing or suppressing tumour progression and/or development [4,12]. Bcl2/Bak is a homologous PPI complex whose partners have opposite effects on cell death and proliferation: Bcl2 promotes cell survival, whereas Bak has a vital role in accelerating programmed cell death. The c-Myc/Max complex is a nuclear phosphorylated transcriptional activator and histone modifier inside the cell, and this PPI also regulates cancer pathways [13–17].

Public databases and the literature report more than 17 000 non-redundant PPIMs [8,18]. Improvements in data extraction and management have aided the identification of this large number of compounds, which have been evaluated against different protein targets using various computational techniques. The advantage of this approach is that PPIMs can bind to many types of protein interfaces, including orthosteric and allosteric sites, and are thus often used as a starting point for PPI-targeting drug discovery programmes compared with other drug discovery strategies [19].

In spite of the progress in PPIM drug discovery, the success rate of finding lead compounds with high-throughput screening of these synthetic small molecules remains quite low. We have compiled a collection of known PPI inhibitors and used this dataset in machine learning methods. We present a support vector machine (SVM)-based prediction web server that uses 10 standard physico-chemical properties/descriptors to build optimal models for known PPIs, namely Mdm2/P53, Bcl2/Bak and c-Myc/Max [20]. The predicted SVM scores of the training/testing datasets of Mdm2/P53 and Bcl2/Bak were compared with IC50 values and docking scores. Finally, small chemicals screened from a large independent dataset from the National Cancer Institute (NCI) were subjected to docking studies to find a relationship between high and randomly selected predicted SVM scores and AutoDock Vina scores.

The data on distinct small molecules (inhibitors) for the three PPIs, Mdm2/P53, Bcl2/Bak and c-Myc/Max, were downloaded from the TIMBAL and PubChem databases. About 80% of the total positive dataset was used as the positive set for fivefold cross-validation, i.e. training/testing data. The positive datasets of Mdm2/P53, Bcl2/Bak and c-Myc/Max consisted of 250, 735 and 15 small molecules, respectively. The PubChem BioAssay structure clustering tool (https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?p=clustering) was used to make sure that the chemicals in the training and testing sets for all three datasets are non-redundant. In the case of Mdm2/P53 and Bcl2/Bak, the negative sets were prepared by choosing 1040 random chemicals from PubChem and adding the positive PPIM sets of the other two PPIs; for example, the Bcl2/Bak and c-Myc/Max positive sets were included in the Mdm2/P53 negative set along with the 1040 random chemicals. In the case of c-Myc/Max, there were only 15 PPIMs in the positive set, so we took only a random small-chemical dataset, 10 times the size of the positive set, as the negative set. The negative datasets of the three PPIs (Mdm2/P53, Bcl2/Bak and c-Myc/Max) therefore comprised 1790, 1305 and 150 molecules, respectively. The positive and negative set sizes are shown in electronic supplementary material, table S1a, and were further divided into five equal parts for the fivefold cross-validation technique.

The PubChem BioAssay structure clustering tool was also used to create non-redundant positive datasets based on 90% and 80% structural similarity of the small chemical PPIMs. At 90% and 80% similarity, the Mdm2/P53 positive dataset reduced to 75 and 40 small chemicals, respectively, and the Bcl2/Bak positive dataset to 185 and 100, respectively. For Mdm2/P53 and Bcl2/Bak, 1 : 1 (positive : negative) ratio datasets were created separately at the 0.99, 0.90 and 0.80 structural similarity thresholds. The Myc/Max dataset was not used in this part of the study, owing to its small size.

The remaining 20% of the positive sets for Mdm2/P53, Bcl2/Bak and c-Myc/Max (30, 100 and 5 PPIMs, respectively), obtained from TIMBAL, were used as blind datasets. These sets were not used in training and testing. The negative blind sets were created in two subsets for each PPI complex, i.e. 1 : 1 (P : N) and 1 : 10 (P : N), drawn randomly from PubChem (electronic supplementary material, table S1b).

The NCI database released in May 2012, consisting of 265 242 structures, was processed; structures without an xlogP3 value were removed, and the remaining 216 103 structures were used as a large independent dataset.

The 2P2I positive dataset, consisting of 40 PPIMs, was used for a comparison study [21].

All the positive datasets of Mdm2/P53, Bcl2/Bak and c-Myc/Max, consisting of 250, 735 and 15 small molecules, were used as a database in SDF two-dimensional format for the drug-like similarity search algorithm.

All the datasets used in this study are available on the 'about' page of PPIMpred at http://bicresources.jcbose.ac.in/ssaha4/PPIMpred/about.php.

We compiled the physico-chemical properties of both the positive and negative datasets of all three PPI complexes from PubChem. Initially, 18 descriptors of each chemical structure were extracted for feature selection. A Student's t-test was performed on the Mdm2/P53 and Bcl2/Bak datasets for feature selection, and 10 descriptors with a p-value of less than 0.05 were finally selected [22].
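The sketch below illustrates this selection step; it is not the authors' code, the arrays are random stand-ins for the PubChem descriptor matrices, and the test shown is the standard two-sample t-test from SciPy.

```python
# Illustrative descriptor selection: a two-sample t-test per descriptor,
# keeping those with p < 0.05. Data here are random placeholders.
import numpy as np
from scipy.stats import ttest_ind

def select_descriptors(pos, neg, names, alpha=0.05):
    """pos, neg: (n_samples, n_descriptors) arrays; returns kept names."""
    kept = []
    for j, name in enumerate(names):
        _, p = ttest_ind(pos[:, j], neg[:, j])   # p-value for descriptor j
        if p < alpha:
            kept.append(name)
    return kept

rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, size=(250, 18))       # e.g. Mdm2/P53 positives
neg = rng.normal(0.0, 1.0, size=(1790, 18))      # corresponding negatives
print(select_descriptors(pos, neg, [f'd{i}' for i in range(18)]))
```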

The SVMlight package was used to classify the PPIMs against the three PPI complexes [23]. The different SVM kernels, namely (i) linear, (ii) polynomial and (iii) radial basis function (RBF), were used to develop the models.

We also used two other machine learning techniques, namely the Naive Bayes and Random Forest methods, via the Weka tool [24].

The positive training set chemicals from Mdm2/P53 and Bcl2/Bak were mapped to the ChEMBL database [25]. Many of them were found to have IC50 values for the specific PPIs. The IC50 values were extracted, converted to log scale and then compared with the SVM scores. No reliable IC50 values have been reported for chemicals against Myc/Max.

The fivefold cross-validation technique was used to analyse the performance of the different classifiers. The small-chemical dataset of the three PPIs (Mdm2/P53, Bcl2/Bak and c-Myc/Max) was divided randomly into five subsets. The machine learning classifiers were trained on four subsets, and performance was measured on the fifth; the process was repeated five times so that each subset could be used for testing. The average performance of the classifiers over the five subsets is taken as the final performance. The threshold-dependent parameters sensitivity, specificity, accuracy, precision (PPV) and F1 score were used, along with the threshold-independent area under the receiver operating characteristic (ROC) curve.

$$\text{Sensitivity}=\frac{TP}{TP+FN},\qquad \text{Specificity}=\frac{TN}{TN+FP},\qquad \text{Accuracy}=\frac{TP+TN}{TP+FN+TN+FP},$$
$$\text{PPV}=\frac{TP}{TP+FP},\qquad F1=\frac{2\,TP}{2\,TP+FP+FN}.$$
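These definitions translate directly into code; the function below is a plain restatement of the formulas, added here only for reference.

```python
# Direct restatement of the five confusion-matrix measures above.
def metrics(tp, fp, tn, fn):
    return {
        'sensitivity': tp / (tp + fn),
        'specificity': tn / (tn + fp),
        'accuracy': (tp + tn) / (tp + fn + tn + fp),
        'ppv': tp / (tp + fp),
        'f1': 2 * tp / (2 * tp + fp + fn),
    }

print(metrics(tp=83, fp=18, tn=82, fn=17))   # toy counts
```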

The predicted SVM scores of the known positive and negative sets were plotted as histograms, and an unknown predicted query score was validated against these plots by computing the area under the curve (AUC). The larger the AUC of a predicted SVM score in the positive plot, the higher the confidence that the prediction is a positive PPIM (details in electronic supplementary material, figure S1).

The randomized datasets of Mdm2/P53 and Bcl2/Bak were prepared from the 1 : 1 (positive : negative) datasets at 0.99 chemical structural similarity, with the positive and negative labels assigned randomly. These randomized datasets were used for fivefold cross-validation with the SVM-based method, and threshold-dependent and -independent measures were computed.

We implemented a similarity searching method to find chemicals similar to a user query input (a structure or a mol file). The ChemmineR package was used [26,27]; its cmp.similarity function computes the atom-pair similarity between two compounds using the Tanimoto coefficient as the similarity measure, i.e. the number of atom pairs shared by the two compounds divided by the size of their union. The function returns a data frame whose rows are sorted by Tanimoto similarity score (best to worst). The formula of the Tanimoto similarity is

$$\text{Tanimoto coefficient}=\frac{c}{a+b+c}.$$

The variable c is the number of atom pairs common to both compounds, while a and b are the numbers of their unique atom pairs.
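The same computation can be expressed compactly over sets of atom pairs; the sketch below is illustrative only (ChemmineR's actual atom-pair representation is more involved), but the coefficient is computed the same way.

```python
# Illustrative set-based Tanimoto similarity between two compounds'
# atom-pair sets; the string labels are toy placeholders.
def tanimoto(pairs_a, pairs_b):
    c = len(pairs_a & pairs_b)                             # shared atom pairs
    a_b = len(pairs_a - pairs_b) + len(pairs_b - pairs_a)  # unique pairs a + b
    return c / (a_b + c)

print(tanimoto({'C-C', 'C-N', 'C=O'}, {'C-C', 'C=O', 'C-S'}))  # 0.5
```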

Docking was performed using AutoDock Vina [28]. Three-dimensional structures of small molecules from the NCI dataset were taken as ligands, and the crystal structures of Mdm2 (PDB id: 1YCR), Bcl2 (PDB id: 2XA0) and Myc (PDB id: 1NKP) were taken as receptors (targets). Ligands and receptors were prepared for docking using AutoDockTools [29]. The docking study focused on the small-molecule sets with the highest, lowest and randomly selected SVM-predicted scores. The known PPIMs from the training/testing datasets against Mdm2/P53 and Bcl2/Bak were also subjected to docking studies.
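Such a docking campaign is typically scripted; the sketch below wraps the AutoDock Vina command line from Python. The file names, grid centre and box size are placeholders, the receptor and ligand PDBQT files are assumed to have been prepared with AutoDockTools, and this is not the authors' pipeline.

```python
# Hedged sketch of scripting AutoDock Vina over a set of ligands.
# All file names and grid parameters below are placeholders.
import subprocess

def dock(receptor, ligand, out, center=(0.0, 0.0, 0.0), size=(20, 20, 20)):
    subprocess.run(
        ['vina', '--receptor', receptor, '--ligand', ligand, '--out', out,
         '--center_x', str(center[0]), '--center_y', str(center[1]),
         '--center_z', str(center[2]), '--size_x', str(size[0]),
         '--size_y', str(size[1]), '--size_z', str(size[2])],
        check=True)

# e.g. dock('mdm2_1YCR.pdbqt', 'nci_hit_001.pdbqt', 'pose_001.pdbqt')
```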

In this study, three PPI targets, namely Mdm2/P53, Bcl2/Bak and c-Myc/Max, were considered. The distribution of PPIs from TIMBAL and their specific chemical targets is shown as a pie chart in electronic supplementary material, figure S2, which clearly shows that Mdm2/P53 and Bcl2/Bak were the top PPI hits among known PPIMs. Although Myc/Max has few hits, it was considered because of its biological significance in disease biology. Besides this, PDB structures of these complexes were available.

Eighteen physico-chemical features were extracted for each chemical in the positive and negative PPIM sets. A t-test was performed to filter out the non-significant features for targeting the three PPI complexes, Mdm2/P53, Bcl2/Bak and c-Myc/Max, as shown in electronic supplementary material, table S2a–c. The same set of 10 descriptors, namely (i) molecular weight, (ii) xlogp3, (iii) hydrogen bond donor count, (iv) rotatable bond count, (v) topological polar surface area, (vi) heavy atom count, (vii) complexity, (viii) defined atom stereocentre count, (ix) defined bond stereocentre count, and (x) covalently bonded unit count, was significant, with p-values of less than 0.05, in both the Mdm2/P53 and Bcl2/Bak datasets. We therefore selected these 10 descriptors for evaluating the machine-learning techniques. The trend was different in the c-Myc/Max dataset, probably owing to its small number of positive examples (n = 15).

Threshold-dependent and threshold-independent measures were used to evaluate the classification of PPIMs against the three PPI complexes (Mdm2/P53, Bcl2/Bak and c-Myc/Max) using SVM, Naive Bayes and Random Forest, as shown in table 1a–c. It is important to remember that the Mdm2/P53 dataset contained 250 positive and 1790 negative small chemicals (random, Myc/Max positives and Bcl2/Bak positives), the Bcl2/Bak dataset contained 735 positives and 1305 negatives (random, Myc/Max and Mdm2/P53), and Myc/Max contained only 15 positives and 150 negatives (random). Among the three SVM kernels, RBF performed best on the Mdm2/P53 and Bcl2/Bak datasets, as shown in electronic supplementary material, table S3a–c, the ROC plots in electronic supplementary material, figure S3a–c, and the density plots of positive and negative training sets in electronic supplementary material, figure S4a–c. Although Random Forest performed best in terms of overall accuracy and AUC, sensitivity was higher with the SVM RBF kernel in all three sets.

Table 1. Comparison of performance on the (a) Mdm2/P53 (1 : 7), (b) Bcl2/Bak (1 : 2) and (c) c-Myc/Max (1 : 10) testing datasets (fivefold cross-validation) using three different kernels of SVM (linear, polynomial and radial basis function), Naive Bayes and the Random Forest method.

methods | sensitivity | specificity | accuracy | F1 score | PPV | AUC

(a)
SVM linear (c = 1, j = 1) | 0.68 | 0.71 | 0.70 | 0.36 | 0.41 | 0.77
SVM poly (d = 1, c = 1, j = 3) | 0.64 | 0.60 | 0.61 | 0.35 | 0.32 | 0.63
SVM RBF (g = 0.0001, c = 10, j = 8) | 0.83 | 0.82 | 0.83 | 0.45 | 0.57 | 0.88
Naive Bayes | 0.16 | 0.97 | 0.87 | 0.22 | 0.39 | 0.83
Random forest | 0.69 | 0.99 | 0.95 | 0.77 | 0.88 | 0.93

(b)
SVM linear (c = 1, j = 2) | 0.73 | 0.60 | 0.65 | 0.67 | 0.62 | 0.69
SVM poly (d = 1, c = 1, j = 3) | 0.60 | 0.49 | 0.53 | 0.49 | 0.48 | 0.61
SVM RBF (g = 0.0001, c = 1, j = 2) | 0.86 | 0.75 | 0.79 | 0.72 | 0.77 | 0.83
Naive Bayes | 0.70 | 0.87 | 0.81 | 0.73 | 0.76 | 0.87
Random forest | 0.87 | 0.94 | 0.92 | 0.88 | 0.90 | 0.95

(c)
SVM linear (c = 1, j = 1) | 0.80 | 0.93 | 0.92 | 0.60 | 0.65 | 0.89
SVM poly (d = 1, c = 10, j = 3) | 0.80 | 0.90 | 0.89 | 0.47 | 0.58 | 0.89
SVM RBF (g = 0.0001, c = 1, j = 1) | 0.87 | 0.91 | 0.90 | 0.50 | 0.63 | 0.91
Naive Bayes | 0.67 | 0.95 | 0.93 | 0.63 | 0.59 | 0.86
Random forest | 0.40 | 0.99 | 0.94 | 0.55 | 0.86 | 0.89

In addition, the 1 : 1 (positive : negative) datasets at 0.99, 0.90 and 0.80 structural similarity, together with the randomized datasets for Mdm2/P53 and Bcl2/Bak, were used for training and testing with the SVM RBF kernel. The ROC plots in figure 1a,b show that the AUC was highest at 0.99 structural similarity for both Mdm2/P53 and Bcl2/Bak, and decreased at 0.90 and 0.80. The plots also show that the randomized-dataset AUC values were 0.55 and 0.52 for the Mdm2/P53 and Bcl2/Bak datasets, respectively. A similar trend was observed in the other datasets, where the number of negative examples was higher than the number of positives (electronic supplementary material, figure S5a,b and tables S4a,b, S5a,b and S6a,b).

Figure 1. (a) The ROC plot for the Mdm2/P53 1 : 1 (P : N) dataset with 0.99 similarity (blue), 0.90 similarity (red), 0.80 similarity (green) and randomization trial (black). (b) The ROC plot for the Bcl2/Bak 1 : 1 (P : N) dataset with 0.99 similarity (blue), 0.90 similarity (red), 0.80 similarity (green) and randomization trial (black).

The small chemical ligands in the positive training sets of the two PPIs (Mdm2/P53, Bcl2/Bak) were mapped to the ChEMBL database, and a structure–activity relationship parameter, the IC50 value, was studied. The log-transformed IC50 values for Mdm2/P53 and Bcl2/Bak were compared with the predicted SVM scores, as shown in electronic supplementary material, figure S6a,b. In the case of Mdm2/P53, about 89% of chemicals had an SVM score above 0.5 with reasonable IC50 values. A similar trend was observed for Bcl2/Bak, with about 82% of chemicals having an SVM score above 0.5 with reasonable IC50 values.

To establish confidence in the effectiveness of the method, a comparison between docking scores (binding free energy values) and the SVM scores of the training set was also drawn for Mdm2/P53 and Bcl2/Bak. The plots are available in electronic supplementary material, figure S7a,b. Interestingly, about 89% of chemicals for Mdm2/P53 and about 82% for Bcl2/Bak had SVM scores above 0.5 with docking scores below −7 kcal mol−1. The docking studies showed that the small chemicals from the Mdm2/P53 training/testing set bind to Mdm2 at the P53 binding site (electronic supplementary material, figure S8a). Similarly, the training set PPIMs of Bcl2/Bak were observed to bind to Bcl2 at the Bak binding site (electronic supplementary material, figure S8b).

The accuracies of the Mdm2/P53 blind set at the two ratios (1 : 1 and 1 : 10) were 84% and 64% (electronic supplementary material, table S7a,b). In the 1 : 10 dataset, the specificity decreased, lowering the overall accuracy; however, the sensitivity in both sets remained more or less the same. Thus, the SVM model developed with fivefold cross-validation and only 10 descriptors is able to predict, with reasonable accuracy, an unknown set not used in training and testing. The overall accuracies on the Bcl2/Bak blind datasets (1 : 1 and 1 : 10) were 66% and 63% (electronic supplementary material, table S8a,b), with higher sensitivities. A similar trend was observed in the c-Myc/Max blind dataset assessment (electronic supplementary material, table S9a,b).

The NCI database, consisting of more than 250 000 small chemicals, was screened with the proposed SVM models for Mdm2/P53, Bcl2/Bak and c-Myc/Max. The predicted SVM scores above zero (to avoid false positives) for the three models were plotted as histogram density plots, and a significance threshold was marked, as shown in electronic supplementary material, figure S9a–c. A highly significant set of PPIMs was selected from these plots by choosing a threshold value in the right tail of the x-axis (above 1.9 for Mdm2/P53, above 1.4 for Bcl2/Bak and above 1.7 for c-Myc/Max), as shown in electronic supplementary material, table S10. Interestingly, the inhibitors predicted against the three complexes (473, 466 and 232 for Mdm2/P53, Bcl2/Bak and c-Myc/Max, respectively) are mutually exclusive, with no overlaps among them (electronic supplementary material, figure S10 and table S11).

The top hits among the SVM-predicted PPIMs for all three complexes were further subjected to in silico docking with AutoDock Vina, and a significance test was performed by comparing them with randomly selected and low-scoring NCI chemicals. The docking results were obtained as binding free energies for the interaction of the small molecules with the three protein targets (Mdm2, Bcl2 and Myc). The box plot of AutoDock scores binned into three groups of SVM-predicted NCI molecules, i.e. top hits (n = 60), low hits (n = 60) and 10 random draws (n = 60 × 10), is shown for Bcl2 in figure 2, and for Mdm2 and Myc in electronic supplementary material, figures S11a and S12a; electronic supplementary material, table S12 gives the values obtained from the box plots. The distribution plots of AutoDock scores in the three bins were also plotted (electronic supplementary material, figures S11b, S12b and S13). The names of the 60 top NCI small-chemical hits against the three targets are available in electronic supplementary material, table S13a–c. The t-test p-values for Bcl2, Mdm2 and Myc were 0.11, 0.56 and 0.14, respectively. Although these results were not significant at the 0.05 level, the Bcl2 and Myc comparisons approached significance at the 0.1 level. In summary, the docking studies indicate that the top SVM-predicted hits score better than random selections.

Figure 2. Box plot showing the binding free energy of top hits, low hits and random hits of Bcl2/Bak.

There is no server dedicated to predicting specific PPIMs against Mdm2/P53, Bcl2/Bak and c-Myc/Max. However, one server, 2P2IHUNTER, can predict whether a chemical can be an orthosteric PPIM. In that tool, an SVM with a radial Gaussian basis kernel was trained on 40 non-redundant small molecules as the positive set and 1018 compounds as a random (decoy) set. Among the 40 PPIMs, there were seven inhibitors for Mdm2/P53, 10 for Bcl2/Bak and none for c-Myc/Max (the Venn diagram is shown in electronic supplementary material, figure S14a, and details in table S14). The overall performance of their optimized model was a sensitivity of 63% and a specificity of 100%. We ran this dataset through our three optimized models specific for Mdm2/P53, Bcl2/Bak and c-Myc/Max. At a threshold value of 0.8, among the 40 PPIMs we observed 16 inhibitors specific for Mdm2/P53, 20 for Bcl2/Bak and one for c-Myc/Max (electronic supplementary material, tables S15 and S16). Interestingly, all seven reported inhibitors for Mdm2/P53 and all 10 reported inhibitors for Bcl2/Bak among the 40 were picked up by our RBF-kernel SVM models (the Venn diagram is shown in electronic supplementary material, figure S14b). 2P2IHUNTER used 11 descriptors, whereas we used 10, with five descriptors in common (logP, molecular weight, topological surface area, hydrogen bond donors and rotatable bonds).

Three different SVM-based models were used to develop the PPIMpred web server for classifying PPIMs targeting the Mdm2/P53, Bcl2/Bak and c-Myc/Max PPIs. The web server offers end users two separate input pages, namely 'molecule search' and 'similarity search'. Molecule search has two options, single molecule search and batch input, as shown in figure 3a. The single molecule search option lets users provide molecular descriptors, select a target (Mdm2/P53, Bcl2/Bak or c-Myc/Max) and define a threshold value. The molecular descriptors are (i) molecular weight, (ii) XLogP3, (iii) hydrogen bond donor count, (iv) rotatable bond count, (v) topological polar surface area, (vi) heavy atom count, (vii) complexity, (viii) defined atom stereocentre count, (ix) defined bond stereocentre count, and (x) covalently bonded unit count. The batch input option lets users upload a file containing the above 10 descriptors for a list of molecules in comma-delimited format (.csv file); a sketch of such a file is given below. The similarity search page lets users draw the desired chemical structure with the JME tool or paste a MOL file of the desired chemical directly into the text area, and then choose a target PPI via a radio button, as shown in figure 3c.
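As an illustration of the batch option, the snippet below writes a minimal input file. The column names and their order are assumptions made for this sketch (the server's own help page defines the exact expected format), and the descriptor values are made-up numbers.

```python
# Toy batch-input file for PPIMpred's batch option; header names, their
# order and the values are illustrative assumptions only.
import csv

HEADER = ['molecular weight', 'xlogp3', 'hydrogen bond donor count',
          'rotatable bond count', 'topological polar surface area',
          'heavy atom count', 'complexity', 'defined atom stereocentre count',
          'defined bond stereocentre count', 'covalently bonded unit count']

with open('batch_input.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(HEADER)
    writer.writerow([581.5, 4.4, 1, 5, 88.1, 40, 938, 1, 0, 1])  # toy row
```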

Figure 3. (a) The home page consisting of submission form for molecular descriptors, target selection and threshold value selection. (b) Result page of prediction shows 'prediction result', 'tabular result' and 'graphical result'. (c) Similarity search page where users can input a molecule either by drawing using JME editor or by pasting MOL 2D format file. (d) Similarity search result shows the list of compounds similar to the query structure.

The output of PPIMpred's molecular search option has three sections: (i) prediction result, (ii) tabular result, and (iii) graphical result, as shown in figure 3b. The prediction result section gives users the number of positive and negative PPIMs, and the tabular section displays the SVM score and the prediction confidence, measured in terms of positiveness and negativeness. For the batch upload option, a result summary is shown in the graphical section as pie charts. In addition, for the similarity search option, the server returns known PPIMs with structures similar to the chemical input query, ranked by Tanimoto similarity score. Each PubChem hit is hyperlinked to its PubChem CID [18] for further information, as shown in figure 3d.

The clinically tested PPIMs for the target PPIs were already present in the positive training/testing set, and all were found to have an SVM score above 0.5: Nutlin-3a (Mdm2/P53) has an SVM score of 1.17, ABT-263 (Bcl2/Bak) of 1.01 and GX15-070 (Bcl2/Bak) of 0.56 (electronic supplementary material, table S17). The positions of these chemicals in the density plots are shown in electronic supplementary material, figure S15a,b.

The focus of this study was to develop a user-friendly, publicly accessible web server to identify lead PPIM molecules in silico for three clinically relevant protein complexes, because experimental screening of huge chemical spaces is a relentless task. Our proposed PPIMpred web server can be useful for high-throughput screening of large chemical datasets for lead recognition. Machine-learning methods were used to classify the data: SVM with three different kernels (linear, polynomial and RBF) was used to find the optimal model, and the Naive Bayes and Random Forest methods were also applied. In addition to categorical classification, the server gives hints of structural similarity with known drug-like molecules for further insight. PPIMpred has two separate search pages, for finding predicted PPIMs and for similarity search, and a batch input option makes the search user-friendly. Comparisons of the predicted SVM scores of known chemicals in the training and testing sets against Mdm2/P53 and Bcl2/Bak with IC50 values and with docking scores were performed. For further validation of the method, a docking study was also performed on the top hits, low-scoring hits and random hits obtained from the independent-set validation; the docking results were analysed with statistical tools such as box plots and density plots, and show the top-scoring molecules to be better modulators of the PPIs. Screening the large NCI chemical dataset gave mutually exclusive hits for the three PPIs in focus, namely Mdm2/P53, Bcl2/Bak and c-Myc/Max. These hits can be further subjected to in silico as well as experimental approaches for identifying lead candidates.

All supporting information, such as tables and figures, is included in the electronic supplementary material.

    S.S. conceived and designed the experiments. T.J., A.G. and S.D.M. performed the experiments. T.J., A.G., S.S. and R.B. analysed the data. T.J., A.G., S.D.M. and S.S. wrote the paper.

    The authors have no competing interests.

This project was supported by the Department of Biotechnology through a Ramalingaswami fellowship (BT/RLF/Re-entry/11/2011). T.J. thanks the Indian Council of Medical Research for a Senior Research Fellowship.

    We thank Dr Smarajit Polley for critically reading the manuscript.

    Footnotes

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

• 1. Wells J, McClendon C. 2007 Reaching for high-hanging fruit in drug discovery at protein–protein interfaces. Nature 450, 1001–1009. (doi:10.1038/nature06526)
• 2. Taylor I et al. 2009 Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nat. Biotechnol. 27, 199–204. (doi:10.1038/nbt.1522)
• 3. Mullard A. 2012 Protein–protein interaction inhibitors get into the groove. Nat. Rev. Drug Discov. 11, 173–175. (doi:10.1038/nrd3680)
• 4. Murray JK, Gellman SH. 2007 Targeting protein–protein interactions: lessons from p53/MDM2. Pept. Sci. 88, 657–686. (doi:10.1002/bip.20741)
• 5. Ivanov A, Gnedenko O, Molnar A, Mezentsev Y, Lisitsa A, Archakov A. 2007 Protein–protein interactions as new targets for drug design: virtual and experimental approaches. J. Bioinform. Comput. Biol. 5, 579–592. (doi:10.1142/S0219720007002825)
• 6. Arkin MR, Wells JA. 2004 Small-molecule inhibitors of protein–protein interactions: progressing towards the dream. Nat. Rev. Drug Discov. 3, 301–317. (doi:10.1038/nrd1343)
• 7. White A, Westwell A, Brahemi G. 2008 Protein–protein interactions as targets for small-molecule therapeutics in cancer. Expert Rev. Mol. Med. 10, e8. (doi:10.1017/S1462399408000641)
• 8. Hamon V et al. 2013 2P2IHUNTER: a tool for filtering orthosteric protein–protein interaction modulators via a dedicated support vector machine. J. R. Soc. Interface 11, 20130860. (doi:10.1098/rsif.2013.0860)
• 9. Nero T, Morton C, Holien J, Wielens J, Parker M. 2014 Oncogenic protein interfaces: small molecules, big challenges. Nat. Rev. Cancer 14, 248–262. (doi:10.1038/nrc3690)
• 10. Gandhi L et al. 2011 Phase I study of navitoclax (ABT-263), a novel Bcl-2 family inhibitor, in patients with small-cell lung cancer and other solid tumors. J. Clin. Oncol. 29, 909–916. (doi:10.1200/JCO.2010.31.6208)
• 11. Hwang J, Kuruvilla J, Mendelson D, Pishvaian M, Deeken J, Siu L, Berger MS, Viallet J, Marshall JL. 2010 Phase I dose finding studies of obatoclax (GX15-070), a small molecule pan-BCL-2 family antagonist, in patients with advanced solid tumors or lymphoma. Clin. Cancer Res. 16, 4038–4045. (doi:10.1158/1078-0432.CCR-10-0822)
• 12. Shangary S, Wang S. 2009 Small-molecule inhibitors of the MDM2–p53 protein–protein interaction to reactivate p53 function: a novel approach for cancer therapy. Annu. Rev. Pharmacol. Toxicol. 49, 223–241. (doi:10.1146/annurev.pharmtox.48.113006.094723)
• 13. Blackwood E, Eisenman R. 1991 Max: a helix-loop-helix zipper protein that forms a sequence-specific DNA-binding complex with Myc. Science 251, 1211–1217. (doi:10.1126/science.2006410)
• 14. Berg T. 2010 Small-molecule modulators of c-Myc/Max and Max/Max interactions. Curr. Top. Microbiol. Immunol. 348, 139–149.
• 15. Kato G, Lee W, Chen L, Dang C. 1992 Max: functional domains and interaction with c-Myc. Genes Dev. 6, 81–92. (doi:10.1101/gad.6.1.81)
• 16. van Delft MF, Huang DCS. 2006 How the Bcl-2 family of proteins interact to regulate apoptosis. Cell Res. 16, 203–213. (doi:10.1038/sj.cr.7310028)
• 17. Dai H, Ding H, Meng X, Lee S, Schneider P, Kaufmann S. 2013 Contribution of Bcl-2 phosphorylation to Bak binding and drug resistance. Cancer Res. 73, 6998–7008. (doi:10.1158/0008-5472.CAN-13-0940)
• 18. Higueruelo A, Schreyer A, Bickerton G, Pitt W, Groom C, Blundell T. 2009 Atomic interactions and profile of small molecules disrupting protein–protein interfaces: the TIMBAL database. Chem. Biol. Drug Des. 74, 457–467. (doi:10.1111/j.1747-0285.2009.00889.x)
• 19. Morelli X, Hupp T. 2012 Searching for the Holy Grail; protein–protein interaction analysis and modulation. EMBO Rep. 13, 877–879. (doi:10.1038/embor.2012.137)
• 20. Kim S et al. 2015 PubChem substance and compound database. Nucleic Acids Res. 44, D1202–D1213.
• 21. Poroikov V, Filimonov D, Ihlenfeldt W, Gloriozova T, Lagunin A, Borodina Y, Stepanchikova AV, Nicklaus MC. 2003 PASS biological activity spectrum predictions in the enhanced open NCI database browser. J. Chem. Inf. Comput. Sci. 43, 228–236. (doi:10.1021/ci020048r)
• 22. Haury A, Gestraud P, Vert J. 2011 The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures. PLoS ONE 6, e28210. (doi:10.1371/journal.pone.0028210)
• 23. Joachims T. 2002 Learning to classify text using support vector machines. Boston, MA: Kluwer Academic Publishers.
• 24. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten I. 2009 The WEKA data mining software. SIGKDD Explor. Newsl. 11, 10. (doi:10.1145/1656274.1656278)
• 25. Gaulton A et al. 2016 The ChEMBL database in 2017. Nucleic Acids Res. 45, D945–D954. (doi:10.1093/nar/gkw1074)
• 26. Cao Y, Charisi A, Cheng L, Jiang T, Girke T. 2008 ChemmineR: a compound mining framework for R. Bioinformatics 24, 1733–1734. (doi:10.1093/bioinformatics/btn307)
• 27. Chakraborty J, Jana T, Saha S, Dutta T. 2014 Ring-hydroxylating oxygenase database: a database of bacterial aromatic ring-hydroxylating oxygenases in the management of bioremediation and biocatalysis of aromatic compounds. Environ. Microbiol. Rep. 6, 519–523. (doi:10.1111/1758-2229.12182)
• 28. Trott O, Olson AJ. 2010 AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461. (doi:10.1002/jcc.21334)
• 29. Morris G, Huey R, Lindstrom W, Sanner M, Belew R, Goodsell DS, Olson AJ. 2009 AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791. (doi:10.1002/jcc.21256)


    Page 5

As cities expand and become increasingly integrated [1,2], the future of public transportation systems (PTS) is bright. In 2014, a record 10.8 billion trips were made by public transportation in the USA [3]. Moreover, with constant urbanization and a strengthening of urban cores [4], the role that PTS will have to play in cities can only increase. This is desirable for several reasons. To start, PTS can significantly help reduce traffic congestion, which takes a significant toll on urban economies every year. They also tend to be more sustainable than private vehicles from the viewpoint of greenhouse gas emissions [5]. Nonetheless, they are complex, and evaluating their performance over time presents significant challenges, especially as they depend on many different variables that continuously evolve [6]. The performance of PTS is typically evaluated with a range of metrics that vary from agency to agency but are rarely combined into an overall performance measure for a system. For example, the Chicago Transit Authority (CTA) focuses on six core areas of service: ridership, schedule, efficiency, cleanliness, safety and courteousness [7]. In the scientific literature, many studies have developed their own performance metrics to evaluate PTS [8–13], notably wrestling with issues of scale (e.g. state versus regional versus city level) [6,8,14,15].

Recently, significant advances in data science and information science have enabled the development of new and powerful techniques to analyse urban data [16–22]. Within this general context, in this article, we introduce Fisher information (FI) and show how it can be used to combine relevant metrics into one performance measure. FI can specifically be used to measure the 'stability' of a system. To compute FI, we use a Python script developed by the authors and published in [23]. Two features of PTS make such a combined measure attractive: PTS consist of many modes, from bus to heavy rail, that depend on many factors, and seasonal fluctuations in ridership are common in almost all transit systems, making the analysis of monthly data difficult.

In information theory, complex systems are considered to be dynamic, orderly and well organized, but they also have the potential to undergo abrupt changes that can dramatically alter their performance. These changes are commonly referred to as regime shifts, e.g. eutrophication of lakes and coastal oceans and regional climate change [24]. Regime shifts happen in PTS as well, from the introduction of a new transit mode to sudden and substantial changes in ridership. FI is a key method developed by Fisher [25] that offers a means of measuring the amount of information about an unknown parameter (e.g. performance) based on current observations, and it has been used to assess dynamic order in real and model systems [24,26–29]. Moreover, FI is particularly well suited to combining many variables to assess the overall performance and stability of a system.

FI can be significantly useful to transit planners for three main reasons. First, FI provides an effective measure of the overall performance of a PTS and, in particular, it is able to detect early warning signs that may precede regime shifts. Second, it provides practical information to transit planners on which other PTS are going or have gone through similar situations. Third, because FI is not sensitive to differences in the scale of the input variables, it can be used to categorize PTS across an entire region or country (e.g. an overall assessment of transit across the USA).

    The main objective of this study is to measure the stability, order and regime shifts in PTS over time in all urbanized areas (UZAs) of the USA. More specifically, this article aims to:

    • — recall concepts of FI and explain how it can be applied to study the stability of PTS,

    • — compute FI for all PTS in the 372 UZAs reported by the National Transit Database (NTD),

    • — analyse and interpret results in FI to detect patterns in the evolution of PTS, and

    • — categorize PTS based on the interpretation from the computed FI.

To achieve these goals, monthly public transit data from 2002 to 2016 were collected from the NTD [30]. In particular, we use unlinked passenger trips (UPTs) and vehicle revenue miles (VRM) for all modes reported in the NTD. Using these data, we can compute FI for all US transit systems and assess the overall pattern over the 14-year period. Overall, we find the presence of eight different patterns. Specifically, we detect regime shifts in 254 PTS, and we observe decreasing FI trends for 308 PTS, which may lead to regime shifts [31]. Additionally, we find increasing FI trends for 136 PTS. An open source Python library coded by the authors has been used to compute the FI. The code can be freely downloaded from GitHub, see [32], and a tutorial is available from the authors' main website [33]. Full details on FI are not provided here; the reader is referred to the report of the US Environmental Protection Agency [34] and to Ahmad et al. [23], which clearly explain how FI is calculated step by step.

    FI was developed by the statistician Fisher [25], and it offers a means to measure indeterminacy. In other words, FI can measure the amount of information about an unknown parameter θ that is present in observable data X. In this section, we briefly recall the fundamentals of FI, but more details can be found in [23,32]. Mathematically, the FI available about θ, I(θ), is defined as [24]

$$I(\theta)=\int \frac{\mathrm{d}X}{p_{0}(X\mid\theta)}\left[\frac{\partial p_{0}(X\mid\theta)}{\partial\theta}\right]^{2},\qquad(2.1)$$

where $p_{0}(X\mid\theta)$ is the probability of observing a particular value of X in the presence of θ.

    In practice, it is essentially impossible to use equation (2.1) because the computation of the partial derivative (∂p0(X|θ)/∂θ) is required for this process, and it depends on the numeric value of the unknown parameter θ. Through numerous derivation steps, Mayer et al. [29] adapted this equation for application to real systems:

$$I=\int \frac{\mathrm{d}s}{p(s)}\left[\frac{\mathrm{d}p(s)}{\mathrm{d}s}\right]^{2}.\qquad(2.2)$$

    Based on the probability of observing various states of the system p(s), equation (2.2) is the foundational form of FI used in this work. Karunanithi et al. [24] further simplified this equation to compute FI numerically for systems characterized by discrete data

$$\mathrm{FI}\approx 4\sum_{i=1}^{m}\left[q_{i}-q_{i+1}\right]^{2},\qquad(2.3)$$

where m is the number of states of the system. A state is defined as a condition of the system determined by specifying a value for each of the variables that characterize its behaviour [24], and $q \equiv p^{1/2}$ is the amplitude of p(s). Complete details on FI, related derivations and the calculation methodology can be found in [23,34].
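A minimal sketch of equation (2.3) is given below; it assumes the observations in a window have already been assigned to m discrete states (the size-of-state binning described later is omitted), and the boundary term $q_{m+1}$ is handled simply by taking consecutive differences.

```python
# Minimal sketch of the discrete FI of equation (2.3); state assignment
# (size-of-state binning) is omitted and the boundary term is simplified.
import numpy as np

def fisher_information(states, m):
    """states: sequence of integer state ids (0..m-1) within one window."""
    states = np.asarray(states)
    p = np.bincount(states, minlength=m) / states.size  # state probabilities
    q = np.sqrt(p)                                      # amplitudes q = p**0.5
    return 4.0 * np.sum(np.diff(q) ** 2)                # 4 * sum (q_i - q_{i+1})^2

# toy example: 12 monthly observations falling into 3 states
print(fisher_information([0, 0, 1, 1, 1, 2, 2, 1, 0, 0, 1, 2], m=3))
```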

    The Sustainable Regimes Hypothesis was developed to provide a construct for interpreting FI [24,26,35], and it includes four main principles that form the foundation of our interpretation of the FI results:

    • — A steady increase in FI indicates that the system is becoming more organized/stable.

• — A system is considered to be in an orderly dynamic regime when a non-zero FI remains nearly constant over time (i.e. $d\langle FI\rangle/dt \approx 0$).

    • — A steady decrease in FI insinuates that the system is losing its functionality, stability and the dynamic behaviour patterns are breaking down. This declining trend may provide warning of an impending regime shift [31].

    • — A regime shift between two stable dynamic regimes is characterized by a sharp drop in FI followed by a recovery or rebound.

As a limitation of FI, we note that while it can measure whether a system is stable, orderly or going through a regime shift, it cannot discern which particular variable in X is causing a particular change.

PTS are commonly considered the most sustainable motorized transportation systems, and they have been present in the USA for more than a century. With significant changes in technology, the economy and the socio-political environment, PTS have also changed substantially over time and have undergone several regime shifts. For a history of PTS, the reader is referred to Vuchic [36]. By looking at historical data, FI can therefore be used to track the changes experienced by PTS over time, which can help better understand patterns in ridership, for instance. More specifically, we look into the FI for all UZAs in the USA as reported by the NTD for (i) rail, (ii) bus, (iii) others (i.e. modes that are neither rail nor bus) and (iv) all (i.e. overall performance).

The NTD defines two major categories of PTS modes: rail and non-rail; among non-rail modes, the bus is by far the most predominant. The modes in the rail category used for this analysis are commuter rail (CR), heavy rail (HR), hybrid rail (YR), light rail (LR) and monorail/automated guideway (MG). The modes in the bus category are commuter bus (CB), bus (MB), bus rapid transit (RB) and trolleybus (TB). Other modes are also present, which we refer to as 'others'; they include Alaska railroad (AR), cable car (CC), inclined plane (IP), street car rail (SR), demand response (DR), demand response–taxi (DT), aerial tramway (TR), ferryboat (FB), jitney (JT), publico (PB) and vanpool (VP) [30].

In this work, PTS data for all US public transportation authorities were collected from the NTD, which reports monthly data from January 2002 for four main types of data: UPTs, VRM, vehicle revenue hours and vehicles operated in maximum service (peak vehicles). Because the latter three are heavily correlated, our analysis uses only UPT and VRM. UPT is defined as the number of passengers who board public transportation vehicles, and VRM as the miles travelled by vehicles while in revenue service [30]. Another way to consider these datasets is that UPT offers an indicator of the demand for transit, while VRM offers an indicator of the supply. We used all data points from January 2002 to December 2016.

First, as already mentioned, the data for each PTS in each UZA are divided into four different systems: (i) rail, (ii) bus, (iii) other and (iv) all. Second, the total UPT and VRM for all UZAs are analysed to get an idea of overall public transit patterns in the USA. As a UZA can host several transit agencies that each may run multiple transit modes, each system is defined by its total number of transit modes across all agencies. As an example, for the rail system in the Chicago UZA, there are three different rail modes run by three different transit agencies: the CTA, Metra and the Northern Indiana Commuter Transportation District. Information on both UPT and VRM is available for these three modes; as we collect two variables for each mode (UPT and VRM), the Chicago rail system is represented by six variables. Overall, Chicago has a total of 32 transit modes, giving us 64 variables.

Subsequently, using the procedure described in [23,34], FI was computed for all UZAs present in the NTD. The window size selected to measure FI was 12, i.e. 12 calendar months, since the NTD reports monthly data, and we chose a window increment of 1. In other words, we first compute the FI for January 2002 to December 2002, then for February 2002 to January 2003, and so on until December 2016. We then calculate the yearly average to assess yearly performance; note that this two-step process allows us to account for seasonal variations in our calculations while outputting a yearly result.1 A sketch of the two-step procedure is given below.
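The sketch below illustrates the windowing and yearly averaging only; `fi_of_window` stands for any callable returning the FI of one window (e.g. the discrete-FI sketch earlier), and all names are illustrative rather than the authors' library.

```python
# Illustrative sliding-window FI: a 12-month window moved in 1-month
# increments, followed by a yearly average to smooth seasonality.
import numpy as np

def windowed_fi(values, months, fi_of_window, window=12, step=1):
    """values: per-month observations; months: numpy datetime64[M] array."""
    dates, fi = [], []
    for start in range(0, len(values) - window + 1, step):
        fi.append(fi_of_window(values[start:start + window]))
        dates.append(months[start + window - 1])   # stamp each window by its end
    return np.array(dates), np.array(fi)

def yearly_mean(dates, fi):
    """Average the windowed FI values within each calendar year."""
    years = dates.astype('datetime64[Y]')
    return {str(y): fi[years == y].mean() for y in np.unique(years)}
```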

On a technical note, FI requires a 'size of state' defined by the size of the hyper-rectangle used [24], where any point outside this hyper-rectangle forms a new state. In this work, we calculated the standard deviation of all windows and identified the smallest value as an indicator of the stable period; following Chebyshev's inequality [37], this standard deviation is multiplied by two to obtain the size of state for that particular variable. More information can be found in [23,34].
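In code, this rule reduces to a few lines; the sketch below is illustrative only.

```python
# Size-of-state rule: the smallest 12-month windowed standard deviation of
# a variable, doubled per Chebyshev's inequality. Names are illustrative.
import numpy as np

def size_of_state(series, window=12):
    stds = [np.std(series[i:i + window])
            for i in range(len(series) - window + 1)]
    return 2.0 * min(stds)

print(size_of_state(np.sin(np.linspace(0, 12, 180))))   # toy monthly series
```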

As mentioned earlier, FI can increase, decrease or remain stable over a period of time. It can also undergo a sharp decrease, suggesting the occurrence of a regime shift. In this article, we define a drop in FI of 3 or greater as a regime shift. To detect the presence of an increasing or decreasing trend, we applied the Mann–Kendall non-parametric test [38,39] with a confidence level of 95%; this test yields three possible outputs for the overall pattern: (i) increasing, (ii) decreasing and (iii) no detectable pattern. We also measured how much each system rebounds after a regime shift or a decrease: a rebound of more than 75% of the original value is defined as a 'full rebound', a rebound of between 25% and 75% of the original value as a 'partial rebound', and failure to rebound to at least 25% of the original value as 'no rebound'. If the Mann–Kendall test failed to detect any pattern and no regime shift was spotted, the evolution of the FI was classified as 'no pattern/others'. Finally, we performed a frequency analysis to count the number of UZAs belonging to each pattern in the evolution of the FI. A sketch of these classification rules follows.
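The sketch below illustrates one plausible reading of these rules; it relies on the third-party pymannkendall package for the Mann–Kendall test (an assumption made here, since the authors' own implementation is in [32]), and the rebound formula is an interpretation of the thresholds above rather than the paper's exact computation.

```python
# Illustrative classification of an FI time series into the patterns of
# table 1. Thresholds follow the text; the rebound formula and the use of
# pymannkendall are assumptions of this sketch.
import numpy as np
import pymannkendall as mk

def classify_fi(fi):
    fi = np.asarray(fi, dtype=float)
    drops = fi[:-1] - fi[1:]
    shift = drops.max() >= 3                    # regime-shift criterion
    i = int(drops.argmax())
    pre, post_min = fi[i], fi[i + 1:].min()
    rebound = (fi[-1] - post_min) / max(pre - post_min, 1e-9)
    trend = mk.original_test(fi, alpha=0.05).trend  # 'increasing'/'decreasing'/'no trend'
    if shift or trend == 'decreasing':
        kind = 'regime shift' if shift else 'decrease'
        if rebound >= 0.75:
            return kind + ' with rebound'
        if rebound >= 0.25:
            return kind + ' with partial rebound'
        return kind + ' without rebound'
    if trend == 'increasing':
        return 'increase'
    return 'no pattern/others'
```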

    The total UPT and VRM from 2002 to 2016 for eight UZAs are shown in figures 1 and 2. As defined in the NTD, the UZAs include New York–Newark, NY–NJ–CT; Washington, DC–VA–MD; Chicago, IL–IN; Boston, MA–NH–RI; San Francisco–Oakland, CA; Philadelphia, PA–NJ–DE–MD; Atlanta, GA; and Sacramento, CA. Moreover, because the data are relatively noisy, we only show yearly averages (despite the fact that we use the original monthly data for the computation of FI).

    Figure 1. Total UPT for eight major UZAs.

    Figure 2. Total VRM for eight major UZAs.

    Using these data, the FI of rail, bus, other and all were calculated for all 372 UZAs reported in the NTD. Overall, we observe eight different patterns in the evolution of FI based on the Sustainable Regimes Hypothesis mentioned earlier. Specifically, we are able to detect patterns for 698 out of 1146 different PTS; the other PTS either do not exist (e.g. Milwaukee, WI, does not have a rail system) or exhibit no detectable pattern. The eight patterns are listed in table 1, together with the frequency analysis for the four categories (i.e. bus, rail, other and all). We can notably observe 254 PTS with regime shifts, 308 PTS with decreasing FI and 136 PTS with increasing FI. Figure 3 shows the FI for eight UZAs representing the eight different patterns observed. The FI for the eight major UZAs used in figures 1 and 2 are shown in figure 4. The same results for all PTS can be found in [40].

    Table 1. Patterns in the evolution of FI. The last four columns give the frequency analysis for the bus, rail, other and all categories.

    pattern | properties | example | bus | rail | other | all
    regime shift with rebound | drop in FI of 3 or greater and a rebound of 75% or greater from the minimum FI | bus: Richmond, VA; rail: Little Rock, AR; other: Utica, NY; all: Utica, NY | 36 | 1 | 36 | 34
    regime shift with partial rebound | drop in FI of 3 or greater and a rebound of 25–75% from the minimum FI | bus: Ithaca, NY; rail: Sacramento, CA; other: Green Bay, WI; all: Corvallis, OR | 25 | 5 | 24 | 18
    regime shift without rebound | drop in FI of 3 or greater without any rebound from the minimum FI | bus: Salem, OR; rail: Portland, ME; other: Mount Vernon, WA; all: Burlington, VT | 27 | 6 | 20 | 22
    decrease with rebound | gradual decrease in FI with a rebound of 75% or greater from the minimum FI | bus: Lancaster–Palmdale, CA; other: Dover–Rochester, NH–ME | 1 | 0 | 1 | 0
    decrease with partial rebound | gradual decrease in FI with a rebound of 25–75% from the minimum FI | bus: Seattle, WA; rail: Portland, OR–WA; other: Medford, OR; all: New York–Newark, NY–NJ–CT | 44 | 8 | 50 | 59
    decrease without rebound | gradual decrease in FI without any rebound | bus: Boston, MA–NH–RI; rail: Chicago, IL–IN; other: Eugene, OR; all: Yakima, WA | 40 | 10 | 38 | 57
    increase | gradual increase in FI | bus: New Haven, CT; rail: Memphis, TN–MS–AR; other: Fairbanks, AK; all: Rochester, NY | 45 | 5 | 45 | 41
    no pattern/others | no detectable pattern | bus: Buffalo, NY; rail: Springfield, MA–CT; other: Portland, ME; all: Raleigh, NC | 154 | 337 | 158 | 141

    Figure 3. Evolution of FI for eight UZAs with: (a) regime shift with rebound, (b) regime shift with partial rebound, (c) regime shift without rebound, (d) decrease with rebound, (e) decrease with partial rebound, (f) decrease without rebound, (g) increase and (h) other.

    Figure 4. FI for eight major UZAs.

    From figure 1, we can see that the total UPT for the eight major cities remained stable from 2002 to 2016, except for some minor fluctuations. In New York, the total UPT for bus and other remained flat, whereas the UPT for rail and all increased slightly in 2004 and continued to increase with a mild slope thereafter; this suggests that the overall changes in ridership in New York depend chiefly on the rail modes. In Washington, DC, the total UPT for all the modes remained mostly uniform throughout the period, but the ridership for rail and all followed a particularly similar pattern, suggesting a dominance of the rail mode for overall ridership patterns. The evolution of the UPT in Chicago remained stable for all four modes, and the overall ridership pattern is analogous to the pattern in the bus mode, indicating that the bus may be the dominant public transportation mode in Chicago. For Boston, akin to Washington and New York, the rail mode is dominant. In San Francisco–Oakland, as in other UZAs, the ridership patterns were stable, except for slight decreases in 2003 and 2009; here too, the bus is the major public transportation mode, as its trends match the overall ridership pattern. Like Chicago, Philadelphia shows stable total UPT trends for all the modes, with an overall pattern similar to that of the bus. For Atlanta, a small increase is observed until 2007, followed by a gradual decrease; between 2002 and 2006, the total UPT for the bus was slightly higher than that of rail, and ridership for the two modes was nearly equal in the remaining years. Finally, for Sacramento, the UPT for all systems increased gradually until 2009 and then decreased steadily until 2016.

    Looking at VRM, we see that except for New York, Washington, Atlanta and Sacramento, the VRM of the other UZAs were stable from 2002 to 2016. New York experienced a jump in VRM in 2004, which then dropped in 2009 and remained around the 40 million mark until 2016. For Washington, DC, the VRM increased gradually from 2002 to 2016; although rail ridership in Washington was higher than that of the bus, revenue miles for the two modes were almost similar. In Atlanta, the VRM increased from 2004 to 2008, decreased until 2010, and then remained almost constant until 2016. Finally, the VRM patterns for Sacramento, CA, are similar to those found for the UPT, except for the rail system, which remained relatively stable after an increasing period between 2002 and 2005.

    From figure 1, we can see that except in New York, Washington and Boston, the bus attracts most riders, which is also reflected in higher bus VRM. Nonetheless, although the bus is dominant in Chicago and San Francisco, rail VRM were higher there from 2002 to 2016. In Philadelphia and Atlanta, bus VRM were higher than those of all other modes.

    Focusing on the FI results in table 1, we can observe that 'no pattern/others' dominates for all categories. The majority of the UZAs for rail, bus, other and all fall in this category, which suggests that most PTS are either significantly stable or perpetually looking for stability. Among the other PTS, a significant number follow an increasing trend, including the bus systems in San Jose, New Orleans, Oklahoma City and Milwaukee. Moreover, of the PTS that experienced a regime shift, few were able to rebound completely or partially. Regime shifts are mostly observed in small UZAs because of the introduction or termination of a mode, Sacramento offering a notable exception: in 2003, both the UPT and the VRM increased gradually, and yet the rail mode underwent a regime shift. A similar situation occurred in 2009, but no regime shift was detected. The prime reason for a rebound after a regime shift is likewise the introduction or termination of a transit mode. In the case of a decreasing pattern (e.g. the rail systems in Chicago and Washington), few PTS were able to rebound to their earlier state, and nearly equal numbers of PTS rebounded partially or failed to rebound after experiencing a decrease in FI. These results give some cause for concern, as a decreasing pattern may be a warning of an upcoming regime shift since uncertainty in the system is increasing (see Sustainable Regimes Hypothesis in §2.2).

    Figure 3 shows examples of the eight different patterns observed overall. A regime shift with rebound was found in the bus system of Richmond, VA. The rail system of Sacramento experienced a regime shift with partial rebound, whereas the rail system in Portland, ME, experienced a regime shift but failed to rebound. A decreasing trend in FI is found in Dover–Rochester, NH–ME, as well as in Seattle, WA, and Phoenix–Mesa, AZ. However, in Dover–Rochester, NH–ME, all systems managed to rebound completely, while the Seattle, WA, transit system rebounded partially and the Phoenix–Mesa, AZ, bus system failed to rebound altogether. Moreover, the Oklahoma City, OK, bus system shows an increasing trend in FI, and no detectable pattern was found for the PTS of Buffalo, NY.

    From figure 4, we can observe that in New York, the rail, other and all modes have a decreasing trend in FI; while the rail system was unable to rebound, the other and all categories rebounded partially. No detectable pattern was found for the bus system in New York. In Washington, all modes have a decreasing trend in FI, and only the other system was able to rebound partially; the remaining modes failed to rebound. For Chicago, no detectable pattern was found for the bus system, whereas the rail and all systems show a decreasing pattern without any rebound, and the other system shows a decreasing pattern with partial rebound. For Boston, a decreasing pattern without any rebound is observed, except for the rail mode, for which no detectable pattern was found. Rail, other and all show a decreasing pattern without any rebound in San Francisco, but no detectable pattern was identified for the bus mode. For Philadelphia, a decreasing pattern with partial rebound is observed for the other system, and no detectable pattern was found for the remaining three systems. No detectable patterns were found for the rail and bus systems in Atlanta, whereas decreasing trends with partial rebound are observed for the remaining two. Finally, for Sacramento, a regime shift was found for the rail system in 2003, essentially capturing the increasing period for both the UPT and the VRM of the rail system.

    Overall, this analysis showed no negative regime shifts; we mostly observed regime shifts due to the introduction of a new mode. While this is desirable, we also noticed that several modes seem to be on a decreasing trend, which can lead to a regime shift or dysfunction if nothing is done (e.g. bus system in Boston and rail in Chicago). An analysis of the trend in FI can therefore help transit agencies identify whether their systems are currently maintaining or losing stability, leading to possible measures to improve performance.

    The main objective of this paper was to measure the stability, order and regime shifts of the PTS of all US UZAs by using concepts of FI. To achieve this goal, a Python code [32] was used to compute the FI for all the PTS. In particular, UPT and VRM datasets were collected from the NTD and used for this analysis. FI for the bus, rail, other and all modes were computed for all 372 UZAs. We notably found the presence of eight different patterns, and the majority of the systems belong to the final category (i.e. no pattern/others). This suggests that most PTS are searching for a stable state. Among the remaining PTS, a significant number experienced a decrease in FI (i.e. 308 PTS), which is a cause for concern. By contrast, we also found optimistic trends for 136 PTS with an increasing FI. Furthermore, several regime shifts were detected for different UZAs. A regime shift can be positive, for example when a new service is introduced (e.g. in San Diego), or negative, for example when a needed service is terminated. Besides providing the FI for eight UZAs showcasing the eight different patterns, the FI for eight major UZAs were provided. Overall, PTS offer myriad benefits, but they are also complex by nature. Considering that PTS have a bright future, ensuring their success is critical. This is particularly important as the era of the autonomous vehicle is rapidly approaching and will likely result in a decrease in operating costs for PTS. FI offers a means to combine multiple variables of a complex system to determine its overall stability, which can prove valuable in practice.

    The data were collected from the National Transit Database (NTD) for this study. The Python library to compute Fisher information can be freely obtained from GitHub at https://github.com/csunlab/fisher-information.

    N.A., S.D. and H.C. designed the study. N.A. collected and analysed the data for the study. N.A., S.D. and H.C. interpreted the results and developed the manuscript.

    We declare we have no competing interests.

    This research was supported, in part, by NSF Awards 1331800 and 1551731, by the University of Illinois at Chicago Institute for Environmental Science and Policy (IESP) Pre-Doctoral Fellowship, and by the Department of Civil and Materials Engineering at the University of Illinois at Chicago.

    Footnotes

    1 This is akin to the problem of taking the mean of the squares versus the square of the mean.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Derrible S. In press. Urban infrastructure is not a tree: integrating and decentralizing urban infrastructure systems. Environ. Plan. B. (doi:10.1177/0265813516647063)

    2. Derrible S. 2016 Complexity in future cities: the rise of networked infrastructure. Int. J. Urban Sci. 1–19. (doi:10.1080/12265934.2016.1233075)

    3. APTA. 2015 Record 10.8 billion trips taken on U.S. public transportation in 2014. See http://www.apta.com/mediacenter/pressreleases/2015/Pages/150309_Ridership.aspx (accessed 13 April 2017).

    4. Delmelle E, Zhou Y, Thill J-C. 2014 Densification without growth management? Evidence from local land development and housing trends in Charlotte, North Carolina, USA. Sustainability 6, 3975–3990. (doi:10.3390/su6063975)

    5. Derrible S, Saneinejad S, Sugar L, Kennedy C. 2010 Macroscopic model of greenhouse gas emissions for municipalities. Transp. Res. Rec. J. Transp. Res. Board 2191, 174–181. (doi:10.3141/2191-22)

    6. Grant M, Plaskon T, Trainor S, Suter S. 2011 State DOT public transportation performance measures: state of the practice and future needs. Washington, DC: Transportation Research Board.

    7. Chicago Transit Authority. 2017 Performance metrics. See http://www.transitchicago.com/performance/default.aspx (accessed 13 April 2017).

    8. Bertini R, El-Geneidy A. 2003 Generating transit performance measures with archived data. Transp. Res. Rec. J. Transp. Res. Board 1841, 109–119. (doi:10.3141/1841-12)

    9. Derrible S, Kennedy C. 2009 Network analysis of world subway systems using updated graph theory. Transp. Res. Rec. J. Transp. Res. Board 2112, 17–25. (doi:10.3141/2112-03)

    10. Henderson G, Kwong P, Adkins H. 1991 Regularity indices for evaluating transit performance. See http://trid.trb.org/view.aspx?id=359612.

    11. Kittelson & Associates et al. 2003 TCRP Report 88: a guidebook for developing a transit performance-measurement system.

    12. Liao C-F, Liu H. 2010 Development of data-processing framework for transit performance analysis. Transp. Res. Rec. J. Transp. Res. Board 2143, 34–43. (doi:10.3141/2143-05)

    13. Pratt R, Lomax T. 1996 Performance measures for multimodal transportation systems. Transp. Res. Rec. J. Transp. Res. Board 1518, 85–93. (doi:10.3141/1518-15)

    14. Baird M, Stammer R. 2000 Measuring the performance of state transportation agencies: three perspectives. Transp. Res. Rec. J. Transp. Res. Board 1729, 26–34. (doi:10.3141/1729-04)

    15. Cramer A, Cucarese J, Tran M, Lu A, Reddy A. 2009 Performance measurements on mass transit: case study of New York City transit authority. Transp. Res. Rec. J. Transp. Res. Board 2111, 125–138. (doi:10.3141/2111-15)

    16. Cottrill CD, Derrible S. 2015 Leveraging big data for the development of transport sustainability indicators. J. Urban Technol. 22, 45–64. (doi:10.1080/10630732.2014.942094)

    17. Ahmad N, Derrible S. 2015 Evolution of public supply water withdrawal in the USA: a network approach. J. Ind. Ecol. 19, 321–330. (doi:10.1111/jiec.12266)

    18. Derrible S, Ahmad N. 2015 Network-based and binless frequency analyses. PLoS ONE 10, e0142108. (doi:10.1371/journal.pone.0142108)

    19. Karduni A, Kermanshah A, Derrible S. 2016 A protocol to convert spatial polyline data to network formats and applications to world urban road networks. Sci. Data 3, 160046. (doi:10.1038/sdata.2016.46)

    20. Kermanshah A, Derrible S. 2017 Robustness of road systems to extreme flooding: using elements of GIS, travel demand, and network science. Nat. Hazards 86, 151–164. (doi:10.1007/s11069-016-2678-1)

    21. Kermanshah A, Derrible S. 2016 A geographical and multi-criteria vulnerability assessment of transportation networks against extreme earthquakes. Reliab. Eng. Syst. Saf. 153, 39–49. (doi:10.1016/j.ress.2016.04.007)

    22. Peiravian F, Derrible S, Ijaz F. 2014 Development and application of the Pedestrian Environment Index (PEI). J. Transp. Geogr. 39, 73–84. (doi:10.1016/j.jtrangeo.2014.06.020)

    23. Ahmad N, Derrible S, Eason T, Cabezas H. 2016 Using Fisher information to track stability in multivariate systems. R. Soc. open sci. 3, 160582. (doi:10.1098/rsos.160582)

    24. Karunanithi AT, Cabezas H, Frieden BR, Pawlowski CW. 2008 Detection and assessment of ecosystem regime shifts from Fisher information. Ecol. Soc. 13, 22. (doi:10.5751/ES-02318-130122)

    25. Fisher RA. 1922 On the mathematical foundations of theoretical statistics. Phil. Trans. R. Soc. Lond. A 222, 309–368. (doi:10.1098/rsta.1922.0009)

    26. Fath BD, Cabezas H, Pawlowski CW. 2003 Regime changes in ecological systems: an information theory approach. J. Theor. Biol. 222, 517–530. (doi:10.1016/S0022-5193(03)00067-5)

    27. Fath BD, Cabezas H. 2004 Exergy and Fisher information as ecological indices. Ecol. Model. 174, 25–35. (doi:10.1016/j.ecolmodel.2003.12.045)

    28. Pawlowski C, Fath BD, Mayer AL, Cabezas H. 2005 Towards a sustainability index using information theory. Energy 30, 1221–1231. (doi:10.1016/j.energy.2004.02.008)

    29. Mayer AL, Pawlowski CW, Cabezas H. 2006 Fisher information and dynamic regime changes in ecological systems. Ecol. Model. 195, 72–82. (doi:10.1016/j.ecolmodel.2005.11.011)

    30. Federal Transit Administration. NTD data. See https://www.transit.dot.gov/ntd/ntd-data (accessed 2 July 2015).

    31. Eason T, Garmestani AS, Cabezas H. 2014 Managing for resilience: early detection of regime shifts in complex systems. Clean Technol. Environ. Policy 16, 773–783. (doi:10.1007/s10098-013-0687-2)

    32. Ahmad N, Derrible S, Eason T, Cabezas H. csunlab/fisher-information. GitHub. See https://github.com/csunlab/fisher-information (accessed 13 April 2017).

    33. Ahmad N, Derrible S, Eason T, Cabezas H. Fisher information. See http://csun.uic.edu/codes/fisher.html (accessed 13 April 2017).

    34. US EPA. 2010 San Luis Basin sustainability metrics project: a methodology for evaluating regional sustainability. Washington, DC: US Environmental Protection Agency, Office of Research and Development, National Risk Management Research Laboratory.

    35. Cabezas H, Fath BD. 2002 Towards a theory of sustainable systems. Fluid Phase Equilibria 194, 3–14. (doi:10.1016/S0378-3812(01)00677-X)

    36. Vuchic VR. 2005 Urban transit: operations, planning and economics. Hoboken, NJ: J. Wiley & Sons.

    37. Lapin LL. 1980 Statistics: meaning and method. New York, NY: Harcourt Brace Jovanovich.

    38. Mann HB. 1945 Nonparametric tests against trend. Econometrica 13, 245–259.

    39. Kendall MG, Gibbons JD. 1990 Rank correlation methods, 5th edn. New York, NY: Oxford University Press.

    40. Ahmad N, Derrible S, Cabezas H. 2017 Using Fisher information to assess stability in the performance of public transportation systems. See https://figshare.com/projects/Using_Fisher_Information_to_Assess_Stability_in_the_Performance_of_Public_Transportation_Systems/20041 (accessed 17 April 2017).


    Page 6

    Tumour antigens have attracted much attention for their importance in cancer diagnosis, prognosis and targeted therapy, as they are crucial tumour biomarkers for identifying tumour cells and are potential targets for cancer therapy [1–3]. Tumour antigens can be broadly classified into two categories based on their specificity: tumour-specific antigens, which are only present in tumour cells; and tumour-associated antigens, which are overexpressed or aberrantly expressed in tumour cells and are also expressed in some normal cells [1]. In addition to abnormal expression patterns, tumour cells also contain a range of cancer somatic mutations, and mutations in protein-coding regions might produce tumour-specific mutant proteins [4,5]. Tumour antigens derived from these tumour-specific mutant proteins are unparalleled tumour biomarkers, as they are only produced by tumour cells; they are potential tumour-specific mutant antigens, or neoantigens [3].

    Tumour antigens recognized by T cells or antibodies must be presented on the surface of tumour cells [6,7]. Most tumour antigens used as drug targets are membrane proteins, such as HER2 and CD19, which are the targets of the antibody trastuzumab [8] and of chimaeric antigen receptor T-cell immunotherapy (CAR-T) for B-cell cancer [9,10], respectively. Additionally, tumour antigens presented by class I major histocompatibility complex (MHC) molecules for recognition by T cells (i.e. tumour-specific neoantigens) can also be used as drug targets [2,11,12]. Furthermore, in immune checkpoint blockade therapy (i.e. PD-1 or CTLA-4 blockade), the neoantigen load is associated with therapy efficacy, which indicates that the neoantigen load is a promising biomarker in cancer immunotherapy [13]. Because of their potential as targets and biomarkers in cancer immunotherapy [1,12,14,15], tumour-specific neoantigens have attracted much attention in biomedical research. Several tools have been developed to predict tumour-specific neoantigens from cancer somatic mutations, such as pVAC-seq [16] and INTEGRATE-neo [17], which predict neoantigens produced by non-synonymous somatic mutations and by gene fusions, respectively. However, these tools only predict neoantigens presented by class I MHC molecules that can be recognized by T cells; they do not consider mutations in the extracellular regions of membrane proteins that can be recognized by mutation-specific antibodies [18,19].

    In this study, we developed integrated software with a graphical user interface (GUI), called the tumour-specific neoantigen detector (TSNAD), which identifies cancer somatic mutations following the best practices of the Genome Analysis Toolkit (GATK v. 3.5) [20] from the genome/exome sequencing data of tumour–normal pairs. We also provide a filter for calling tumour-specific mutant proteins. We then pursued two strategies to predict neoantigens. First, we extracted the extracellular mutations of membrane proteins according to the protein topology. Second, we invoked NetMHCpan (v. 2.8) [21] to predict the binding of mutant peptides to class I MHC molecules. Finally, we applied TSNAD to the cancer somatic mutations collected in the International Cancer Genome Consortium (ICGC) database to predict potential neoantigens.

    Standard sequencing data processing consists of preprocessing, alignment, variant calling, annotation and further analysis. Given that existing software tools are each designed for specific functions, it was necessary to develop an automated and user-friendly framework that chains this series of tools together. This section summarizes the required software and its main features.

    Trimmomatic (v. 0.35) [22]. Raw reads vary in length and contain adaptors that can be harmful to subsequent data processing. This software trims and crops raw reads and removes artefacts.

    Burrows-Wheeler Aligner (BWA, v. 0.7.12) [23,24]. This alignment toolkit is used for mapping short sequences to a reference genome. It is based on the Burrows-Wheeler transformation and is highly efficient at finding the locations of low-divergence sequences on a large genome.

    Samtools (v. 1.3) [25]. Its view and sort functions transform sequencing data from the SAM (sequence alignment/map) format to the BAM (binary alignment/map) format, which saves an enormous amount of storage space. Moreover, it can manage duplicate reads and index alignments.

    Picard tools (v. 1.140) [26]. This program consists of a set of Java command-line tools for handling different sequencing data formats (such as SAM, BAM and VCF). Because duplicate reads may influence further processing, the Picard MarkDuplicates tool is applied to remove repeated sequences.

    Genome Analysis Toolkit (GATK v. 3.5) [20], Mutect2 [27]. The main function of GATK is variant discovery in high-throughput sequencing data. Mutect2 is a package in GATK to identify somatic SNVs and INDELs.

    Annovar (14 December 2015) [28,29]. We use it to functionally annotate somatic mutations, including the position, nucleotide change and amino acid change for protein-coding regions, among other functions. We can then extract tumour-specific mutant proteins.

    SOAP-HLA (v. 2.2) [30]. This software detects the human leucocyte antigen (HLA—the MHC in humans) types for each sample. The program takes sorted aligned sequencing data (BAM format) as the input and outputs HLA types. The HLA types are critical for the MHC-binding predictions.

    TMHMM (v. 2.0) [31]. This tool is used to predict the topology of membrane proteins based on a hidden Markov model (HMM). The prediction of transmembrane helices and membrane proteins is highly accurate [32].

    NetMHCpan (v. 2.8) [21]. This software predicts peptides that can bind to MHC class I molecules using artificial neural networks.

    The somatic mutations were collected from the whole-genome/exome sequencing data of 9155 tumour–normal pairs in the ICGC database (Release 20, http://icgc.org). This dataset compiles over 1.5 million simple somatic mutations in coding regions, among which 828 129 missense variants cause amino acid changes, with frequencies ranging from 1 to 476 out of the 9155 tumour samples.

    The HLA types were extracted from the 1000 Genomes Project. We chose 16 HLA alleles with frequencies of more than 5% in the population collected in the 1000 Genomes Project [33]: five HLA-A (HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*11:01 and HLA-A*24:02), four HLA-B (HLA-B*07:02, HLA-B*35:01, HLA-B*40:01 and HLA-B*51:01) and seven HLA-C (HLA-C*01:02, HLA-C*03:03, HLA-C*03:04, HLA-C*04:01, HLA-C*06:02, HLA-C*07:01 and HLA-C*07:02) alleles.

    The list of human membrane proteins was extracted from the Human Protein Atlas [34]. The amino acid sequences of these membrane proteins were downloaded from Ensembl (GRCh37 v. 75) [35]. TMHMM (v. 2.0) was used to identify the transmembrane topology and extracellular regions of each membrane protein [31].

    After we obtained the list of tumour-specific mutant proteins, we extracted the peptide sequences around the mutation sites. As MHC molecules always bind peptides 8–11 amino acids in length, we extracted peptides 21 amino acids in length, with 10 amino acids upstream and 10 downstream of the mutation site, for NetMHCpan prediction (figure 1); a minimal sketch of this extraction follows the figure. Wild-type peptides of the same length as the mutant peptides were extracted as references. These wild-type and mutant peptides were measured for their binding affinities (50% inhibitory concentration [IC50], nM) to each class I HLA allele. Binding was considered strong if the IC50 value was less than 150 nM and weak if the IC50 value was between 150 and 500 nM; an IC50 value of more than 500 nM indicated non-binding [11].

    Figure 1. Mutant peptides with 21 amino acids and the corresponding 8–11 mer peptides. MHC molecules always bind 8–11 mer peptides, so we extracted peptides 21 amino acids in length, with 10 amino acids upstream and 10 amino acids downstream of the mutation sites, for NetMHCpan prediction. The number 11 in red indicates the mutated site, and the peptides in yellow represent all the possible peptides that may bind to MHC molecules.
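    The sketch below illustrates this peptide extraction and the IC50 classes; the function names are ours, and the KRAS sequence in the usage comment is quoted only far enough to cover the extracted window.

    def mutant_window(protein_seq, pos, alt, flank=10):
        """21-aa peptide centred on the mutated residue (pos is 1-based);
        shorter near the protein ends."""
        i = pos - 1
        mutated = protein_seq[:i] + alt + protein_seq[i + 1:]
        return mutated[max(0, i - flank): i + flank + 1]

    def binding_class(ic50_nm):
        """Strong (<150 nM), weak (150-500 nM) or non-binding (>500 nM) [11]."""
        if ic50_nm < 150:
            return "strong"
        return "weak" if ic50_nm <= 500 else "non-binding"

    # e.g. KRAS G12D (cf. the KRAS peptides in table 3):
    # mutant_window("MTEYKLVVVGAGGVGKSALTIQ", 12, "D") -> "TEYKLVVVGADGVGKSALTIQ"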

    Peptides were obtained lyophilized (more than 95% purity) from Bankpeptide Biological Technology Co., Ltd (Hefei, China), dissolved in 10% DMSO in sterile water and tested for sterility, purity, endotoxin and residual organics. Peptide binding to HLA-A*02:01 was determined by T2 assay [36]. T2 cells were washed in phosphate buffered saline (PBS) and in RPMI-1640 without serum. Cells (5 × 10⁵ cells ml−1) were incubated with 5 µg ml−1 peptide and 10 µg ml−1 human beta-2-microglobulin in serum-free RPMI-1640 for 4 h or overnight at 37°C. The pulsed cells were pelleted and rinsed three times with 1 ml PBS, with centrifugation at 500g for 5 min at 4°C. Cells were resuspended in 200 µl PBS and stained with 1 µl of W6/32 (Thermo Fisher) for 30 min on ice, followed by three rinses with 1 ml PBS at 4°C. Cells were then resuspended in 200 µl PBS with 1 µl of FITC-conjugated goat anti-mouse antibody (Beyotime Biotechnology) for 30 min on ice, followed by three rinses at 4°C. Finally, cells were resuspended in 500 µl PBS. Stained T2 cells were analysed using a FACSCalibur.

    We developed integrated software, called TSNAD, for the Linux operating system with a GUI. The platform is completely automated and is mainly designed for users who have little programming experience. Several neoantigen prediction pipelines exist, such as pVAC-seq and INTEGRATE-neo: pVAC-seq combines tumour mutation and expression data to predict neoantigens by invoking NetMHC v. 3.4, while INTEGRATE-neo predicts neoantigens from fusion genes based on the INTEGRATE pipeline and NetMHC v. 4.0. Like these pipelines, TSNAD uses the widely adopted NetMHCpan v. 2.8 to predict neoantigens. Compared with other neoantigen prediction pipelines, TSNAD has several advantages: first, TSNAD offers a pipeline for mutation calling from sequencing data; second, TSNAD not only considers the neoantigens presented by class I MHC molecules, but also takes mutations in membrane proteins into consideration; third, unlike other pipelines that are operated through the command line, TSNAD provides a GUI that lets biologists without a programming background analyse their data easily. The software consists of two toolkits: mutation detection and neoantigen prediction. Each toolkit is a two-step process: configure the parameters, then run the corresponding toolkit.

    The first step is to configure the software paths and parameters. This step is of great significance, and users are expected to ensure the appropriateness and correctness of the configurations. Detailed instructions on setting paths and parameters can be found in the user's manual. The software paths do not need to be changed once they are set, because TSNAD imports the existing configuration files by default. Users can also edit individual parameters through the GUI or by manually modifying the configuration files. It is worth noting that TSNAD requires its own naming convention for the input files: users can rename their sequencing files to meet this convention either manually or with the renaming tool we provide.

    After setting the configurations, non-expert users can run the pipeline by simply clicking on the appropriate toolbar. In the process monitoring window, users can observe the pipeline's progression. The pipeline, written in Python (v. 2.7), calls standard third-party software and applies a multiprocessing strategy to speed up data processing.
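    The sketch below illustrates the multiprocessing idea only: samples are processed in parallel while the stages of each sample run sequentially. The command lines are placeholders (simple echo calls), not TSNAD's actual third-party invocations.

    import subprocess
    from multiprocessing import Pool

    def run_sample(sample):
        """Run the per-sample pipeline stages sequentially."""
        for cmd in sample["commands"]:       # e.g. trimming, then alignment, ...
            subprocess.run(cmd, check=True)  # abort this sample on any failure
        return sample["name"]

    samples = [
        {"name": "tumour", "commands": [["echo", "trim tumour reads"],
                                        ["echo", "align tumour reads"]]},
        {"name": "normal", "commands": [["echo", "trim normal reads"],
                                        ["echo", "align normal reads"]]},
    ]

    if __name__ == "__main__":
        with Pool(processes=2) as pool:      # tumour and normal in parallel
            print(pool.map(run_sample, samples))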

    When the pipeline finishes, all of the results are stored in a user-specified folder. The mutation detection pipeline returns the list of somatic mutations with annotations. The neoantigen prediction pipeline returns the extracellular mutations of the membrane proteins and the MHC-binding information (all in TXT format).

    The software detects single-nucleotide variants (SNVs) and small insertions or deletions (INDELs) according to the pipeline depicted in figure 2. The raw paired-end sequence data are in FastQ format, from whole-genome sequencing, whole-exome sequencing or targeted gene panel sequencing on the Illumina platform. The raw data are cleaned using Trimmomatic [22]. BWA-MEM is used to map the reads to the reference genome sequences [23,24]. Samtools [25] and Picard [26] are used to process files in SAM or BAM format, including transforming, sorting, merging and marking duplicates. GATK [20] is used to pre-process the BAM files, for example realigning around INDELs and recalibrating base quality scores. Mutect2 [27] in GATK is used to call the somatic SNVs and INDELs between tumour and normal samples. Annovar [28,29] is used to annotate the detailed mutation information.

    Figure 2. The software pipeline of TSNAD. The pipeline follows the GATK best practices to call somatic SNVs and INDELs from whole-genome/exome sequencing data. The extracellular mutations of membrane proteins are then extracted according to the protein topology, and NetMHCpan is invoked to predict the binding of mutant peptides to class I MHC molecules.

    We further provide a filter that retains the somatic mutations in protein-coding regions and the somatic missense variants that meet the cut-off (tumour reads > 10, normal reads > 6, tumour alteration reads > 5, variant allele frequency (VAF) in tumour DNA > 0.05 and VAF in normal DNA = 0).
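    Expressed as code, the filter reads as below; the dictionary keys are illustrative names for the annotated fields, not TSNAD's internal representation.

    def passes_filter(m):
        """Cut-off for calling tumour-specific missense variants."""
        return (m["tumour_reads"] > 10 and
                m["normal_reads"] > 6 and
                m["tumour_alt_reads"] > 5 and
                m["tumour_vaf"] > 0.05 and
                m["normal_vaf"] == 0)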

    Specific antibodies can be generated against peptides that differ by only one amino acid [19,37]. Therefore, missense mutations in proteins present on the surface of tumour cells are important targets for antibody-based immunotherapy. We pursued two strategies to predict the neoantigens that would be present on the surfaces of tumour cells [1,2]. First, we extracted the somatic mutations in the extracellular regions of membrane proteins. Second, we predicted the neoantigens that would be presented on the cell surface by evaluating the binding affinity between peptides and class I MHC molecules.

    According to the Human Protein Atlas, there are 5462 predicted membrane proteins [34]. We identified the transmembrane topologies and extracellular regions of these proteins using TMHMM [31]. To identify the extracellular mutations of membrane proteins, the filtered cancer somatic missense variants were mapped to the extracellular regions of the membrane proteins. We further examined the characteristics of the mutant amino acids: mutations that change the polarity of an amino acid deserve particular attention, as they may be more likely to cause differences between wild-type and mutant proteins in their binding to antibodies.
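    This mapping step amounts to an interval test against the parsed TMHMM topology, as in the sketch below; the segment labels follow TMHMM's output ('inside', 'outside', 'TMhelix'), while the data structure itself is our assumption.

    def is_extracellular(topology, residue):
        """True if a 1-based residue position falls in an extracellular segment.

        topology: list of (start, end, label) segments parsed from TMHMM
        output, e.g. [(1, 28, "outside"), (29, 51, "TMhelix"), ...].
        """
        return any(start <= residue <= end and label == "outside"
                   for start, end, label in topology)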

    In addition to the membrane proteins, peptides can be presented on the cell surface by the antigen-presenting system, which is mediated by MHC molecules. SOAP-HLA was used for the HLA typing of each sample [30]. NetMHCpan was used to predict the binding affinity between class I MHC molecules and the wild-type/mutant peptides [21]. We then compared the binding of the HLA molecules to the wild-type and mutant peptides, and the mutant peptides that bind to HLA-A/B/C molecules were extracted for further analysis; specific binding of an HLA protein to a mutant peptide but not to the corresponding wild-type peptide is preferred, as such peptides are potential drug targets that would not affect normal tissues.
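    The selection of such mutant-specific binders can be sketched as follows; the 500 nM cut-off is the non-binding threshold used above, and the input layout is our assumption. For instance, the TP53 R248W pair discussed later (wild type non-binding, mutant at 350 nM) would be retained.

    def specific_binders(pairs, cutoff=500.0):
        """Keep (HLA, mutant peptide) pairs where the mutant binds (IC50 <=
        cutoff) but the corresponding wild type does not (IC50 > cutoff).

        pairs: dict mapping (hla, mutant_peptide) -> (wt_ic50_nm, mut_ic50_nm).
        """
        return {key: mut for key, (wt, mut) in pairs.items()
                if mut <= cutoff < wt}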

    In a previous study, we performed oncogene-targeted deep sequencing on a malignant peritoneal mesothelioma [38]. Applying TSNAD to the sequence data of the tumour sample and the paired peripheral blood sample, we detected 2897 somatic SNVs and 218 somatic INDELs. Four SNVs (in NOTCH2, PDE4DIP, ATP10B and NSD1) and one frameshift INDEL (in BAP1) were validated by Sanger sequencing of tumour RNA. We also predicted the neoantigens on these mutated proteins and found specific binding of the neo-peptide generated by the BAP1 frameshift INDEL to the patient's HLA-B*35:42. A polyclonal antibody against the BAP1 neo-peptide was produced in rabbits and showed good antibody–neoantigen specificity, indicating that the BAP1 neo-peptide could be a potential tumour-specific neoantigen [38].

    In addition to handling original sequencing data, TSNAD can also analyse existing mutation data to predict potential neoantigens. We applied TSNAD to the simple somatic mutations of 9155 samples from the ICGC database and predicted numerous neoantigens, including extracellular mutations of membrane proteins and peptides presented by class I MHC molecules.

    To identify the extracellular mutations of membrane proteins, we mapped all of the missense mutations to the extracellular regions of membrane proteins, obtaining a dataset of 88 354 extracellular mutations. The majority of these extracellular mutations (89.6%, 79 198 out of 88 354) occur only once among the 9155 donors (electronic supplementary material, table S1 and figure S1), which illustrates the high heterogeneity of tumour samples. Membrane-protein mutations that occur in more samples, however, are ideal drug targets for antibody-based immunotherapy. The top 20 most frequent extracellular mutations are listed in table 1; MUC4:H4205Q is the most frequent extracellular mutation (44 out of 9155).

    Table 1. Top 20 most frequent extracellular mutations in 9155 donors.

    Chr | Pos | ID | gene | DNA mutation | protein mutation | mutation frequency
    3 | 195505836 | MU10935 | MUC4 | 12615C>G | H4205Q | 44 out of 9155
    1 | 29138975 | MU68226 | OPRD1 | 80G>T | C27F | 25 out of 9155
    1 | 120611960 | MU869951 | NOTCH2 | 61G>A | A21T | 23 out of 9155
    19 | 1065018 | MU68245 | ABCA7 | 6133G>T | A2045S | 18 out of 9155
    15 | 22369378 | MU4380351 | OR4M2 | 803C>T | S268F | 16 out of 9155
    17 | 37868208 | MU85975 | ERBB2 | 929C>T | S310F | 15 out of 9155
    3 | 195509676 | MU586249 | MUC4 | 8775G>C | Q2925H | 15 out of 9155
    7 | 55233043 | MU589341 | EGFR | 1793G>T | G598V | 15 out of 9155
    3 | 195515449 | MU605883 | MUC4 | 3002T>A | V1001E | 14 out of 9155
    19 | 9072091 | MU4382243 | MUC16 | 15355C>T | P5119S | 13 out of 9155
    20 | 17639816 | MU4585427 | RRBP1 | 1337A>C | Q446P | 12 out of 9155
    5 | 179071958 | MU4110168 | C5orf60 | 64G>C | D22H | 12 out of 9155
    7 | 146829338 | MU4413315 | CNTNAP2 | 1085G>A | G362E | 12 out of 9155
    1 | 158261127 | MU4408485 | CD1C | 265C>T | R89C | 11 out of 9155
    11 | 5345040 | MU4383907 | OR51B2 | 488C>T | S163L | 11 out of 9155
    17 | 21319519 | MU613603 | KCNJ12 | 865G>C | E289Q | 11 out of 9155
    2 | 46707884 | MU70561 | TMEM247 | 458A>G | Q153R | 11 out of 9155
    2 | 137814319 | MU4440003 | THSD7B | 469G>A | E157K | 11 out of 9155
    3 | 195511286 | MU4617526 | MUC4 | 7165G>A | D2389N | 11 out of 9155
    7 | 139167934 | MU66261 | KLRG2 | 455A>C | K152T | 11 out of 9155

    Peptides can also be presented on the cell surface via the antigen-presenting system mediated by MHC class I molecules. In this manner, mutant peptides that are presented exclusively by tumour cells are potential neoantigens; the resulting MHC–peptide complexes are referred to as neoantigens.

    Based on the missense mutations of the 9155 tumour samples from the ICGC, we extracted peptides 21 amino acids in length, with 10 amino acids upstream and 10 downstream of the mutation sites. Both the mutant and reference peptides were extracted. Combined with the 16 HLA alleles whose frequencies exceed 5% in the 1000 Genomes Project population, we used our software, invoking NetMHCpan (v. 2.8) [21], to predict the binding affinity between the HLA molecules and the collected peptides. We then compared the binding of the HLA proteins to the wild-type and mutant peptides and collected the specific bindings of HLA proteins to mutant peptides. These mutant peptides are regarded as potential neoantigens. In total, we obtained a dataset of 1 420 785 records. We also analysed the distribution of this dataset (electronic supplementary material, table S2 and figure S2); the results show a distribution similar to that observed for the membrane proteins.

    Mutations that occur more frequently across samples may play important roles in tumorigenesis. There are 65 potential common neoantigens whose corresponding mutations appear in at least 20 of the 9155 donors from the ICGC database and that have an IC50 of less than 500 nM. These 65 neoantigens are related to 23 somatic mutations in 12 genes (table 2; electronic supplementary material, table S3). KRAS, PIK3CA and TP53 account for more potential neoantigens than the other genes, indicating that they play particularly important roles in tumour immunotherapy; this is consistent with previous findings that KRAS and PIK3CA are oncogenes and that TP53 is a tumour suppressor gene [39]. Moreover, some genes that have not been identified as tumour-associated genes by the Cancer Gene Census, such as MUC4, FAM194B, OPRD1 and FRG1, also encode potential neoantigens.

    Table 2. Sixty-five potential common neoantigens and their corresponding genes and mutation frequencies. Genes with no entry in the 'role in tumour' column are not classified by the Cancer Gene Census.

    gene | role in tumour | no. mutations | no. neoantigens
    KRAS | oncogene | 5 | 11
    PIK3CA | oncogene | 5 | 21
    TP53 | tumour suppressor gene | 3 | 8
    SF3B1 | tumour-related gene | 1 | 2
    MUC4 | | 1 | 1
    CHEK2 | tumour suppressor gene | 1 | 2
    PTEN | tumour suppressor gene | 2 | 3
    FAM194B | | 1 | 2
    OPRD1 | | 1 | 5
    CTNNB1 | oncogene | 1 | 5
    FRG1 | | 1 | 4
    GNAS | tumour-related gene | 1 | 1

    We found that the most frequent potential neoantigens are encoded by KRAS, a well-established oncogene. Six of the top 10 potential neoantigens are related to the KRAS gene, arising from two different mutations: G12D and G12V. Among these six peptides, three (KLVVVGADGV, KLVVVGAVGV and KLVVVGAV) are presented by HLA-A*02:01, one (TEYKLVVVGAV) is presented by HLA-B*40:01, one (GAVGVGKSAL) is presented by HLA-C*03:04 and one (GAVGVGKSAL) is presented by HLA-C*03:03 (table 3).

    Table 3. Top 10 neoantigens with the highest mutation frequency in 9155 donors.

    gene | HLA allele | position | peptide | mutation | affinity (nM) | mutation frequency
    KRAS | HLA-A*02:01 | 8 | KLVVVGADGV | G12D | 214 | 322 out of 9155
    KRAS | HLA-A*02:01 | 8 | KLVVVGAVGV | G12V | 112 | 239 out of 9155
    KRAS | HLA-A*02:01 | 8 | KLVVVGAV | G12V | 163 | 239 out of 9155
    KRAS | HLA-B*40:01 | 11 | TEYKLVVVGAV | G12V | 90 | 239 out of 9155
    KRAS | HLA-C*03:04 | 3 | GAVGVGKSAL | G12V | 172 | 239 out of 9155
    KRAS | HLA-C*03:03 | 3 | GAVGVGKSAL | G12V | 172 | 239 out of 9155
    PIK3CA | HLA-C*07:02 | 2 | ARHGGWTTKM | H1047R | 218 | 200 out of 9155
    PIK3CA | HLA-C*06:02 | 3 | ARHGGWTTKM | H1047R | 457 | 200 out of 9155
    PIK3CA | HLA-C*07:01 | 2 | ARHGGWTTKM | H1047R | 249 | 200 out of 9155
    PIK3CA | HLA-A*11:01 | 11 | STRDPLSEITK | E545K | 81 | 182 out of 9155

    To study the distribution of the neoantigens across the different HLA types, we classified the 1 420 785 records into 16 groups according to the HLA types we used (figure 3a). We found that approximately 10 mutant peptides can bind to each HLA type in each sample; as each individual carries up to six class I HLA alleles (two each for HLA-A, -B and -C), this means that we can find about 60 neoantigens in each tumour sample on average.

    Figure 3. The distribution of tumour-specific neoantigens across 16 HLA types and 20 tumour types. (a) The number of tumour-specific neoantigens for each HLA type, shown in decreasing order. The dashed line indicates the average number of neoantigens. (b) Distribution of tumour-specific neoantigens across 20 tumour types. The width of each violin indicates the proportion of donors sharing a certain number of neoantigens in each tumour type. The upper and lower limits of the white bar and the black line within it denote the upper quartile, lower quartile and median number for each type.

    Because of the high heterogeneity of tumours, we further investigated the distribution of neoantigens in each tumour type (based on the tissue of origin; figure 3b). The results show that the neoantigen load is related to the somatic mutation burden: cancer types with a higher mutation load, such as skin and lung cancer, have more neoantigens on average. Interestingly, uterus cancer has the largest average number of neoantigens (715.98; electronic supplementary material, table S4), yet its median number of neoantigens ranks only 10th among the 20 cancer types (figure 3b). The reason may be that the number of neoantigens varies greatly among uterus cancer patients, with several uterus tumours carrying very large numbers of neoantigens. Nervous system cancers possess the fewest neoantigens on average (2.39). These results indicate that the neoantigen load differs considerably not only between cancer types but also between tumours from the same tissue.

    We chose one of the 65 potential common neoantigens, generated by the TP53 R248W mutation, to experimentally confirm the specific binding of a neoantigen to HLA-A*02:01 using the T2 assay [36]. We predicted that the wild-type (WT) peptide (GMNRRPILTII) could not bind to HLA-A*02:01, while the mutant peptide (GMNWRPILTII) could bind weakly, with an IC50 of 350 nM (electronic supplementary material, table S3). The T2 cell line is widely used to confirm the binding of peptides to HLA-A*02:01, as its surface HLA levels can be stabilized by the addition of exogenous HLA-binding peptides while the cells are unable to present endogenous HLA-associated peptides [15,36]. To assess binding strength, we first incubated T2 cells with the WT and mutant peptides, respectively, and then stained them with the W6/32 antibody, which targets HLA molecules stabilized by any HLA-binding peptide; the binding strengths of the WT and mutant peptides were thus compared via W6/32 staining. Analysis of the pulsed cells by flow cytometry showed that binding of the TP53 R248W mutant peptide to T2 cells was significantly above the background levels of staining obtained with the WT peptide or with negative control cells (figure 4), which confirms the specific binding of the TP53 mutant peptide to HLA-A*02:01. Therefore, the R248W mutation of TP53 can generate a potential tumour-specific neoantigen in patients with HLA-A*02:01, which can be an ideal target for neoantigen-specific cancer immunotherapy.

    Figure 4. Specific binding of the mutant peptide of TP53 to HLA-A*02:01. Blank control: FITC-goat anti-mouse IgG + T2 cells; negative control: human beta-2-microglobulin incubated with T2 cells overnight at 37°C + W6/32 + FITC-goat anti-mouse IgG; wild-type (WT) peptide binding affinity analysis: WT peptide (GMNRRPILTII) and human beta-2-microglobulin incubated with T2 cells overnight at 37°C + W6/32 + FITC-goat anti-mouse IgG; mutated peptide binding affinity analysis: mutated peptide (GMNWRPILTII) and human beta-2-microglobulin incubated with T2 cells overnight at 37°C + W6/32 + FITC-goat anti-mouse IgG.

    TSNAD is a tool for detecting cancer somatic mutations following the GATK best practices [20]. TSNAD also predicts potential neoantigens [1], which can be either extracellular mutations of membrane proteins or mutant peptides presented by class I MHC molecules; this is particularly valuable for biologists without a programming background. We applied the antigen-prediction tool of TSNAD to predict neoantigens, including extracellular mutations of membrane proteins and neoantigens presented by MHC class I molecules, and we experimentally verified the specific binding of the predicted TP53 mutant peptide (R248W; wild-type: GMNRRPILTII, mutant: GMNWRPILTII) to HLA-A*02:01. The neoantigens predicted in this study are an important resource for selecting suitable drug targets. In future studies, these predicted neoantigens will require further experimental validation of their potential as drug targets for T-cell or antibody-based immunotherapy.

    The software and codes are freely available from https://github.com/jiujiezz/tsnad and the predicted neoantigens are freely available from http://biopharm.zju.edu.cn/lab/database/tsnadb.

    Z.Z., Z.X.S., X.G. and S.Q.C. designed and directed the research; Z.Z., X.Z.L. wrote the programs; Z.Z., J.C.W., S.S.W. and J.Z. performed data analysis; X.Y.Y. performed experimental validation. Z.Z., X.Z.L., J.C.W. and Z.X.S. wrote the manuscript. All authors reviewed the manuscript.

    We declare that we have no competing interests.

    This work was supported by grants from the National Natural Science Foundation of China (31501021 and 81430081), the Zhejiang Provincial Natural Sciences Foundation of China (LY15C060001), the Fundamental Research Funds for the Central Universities and the State Key Laboratory of Genetic Engineering at Fudan University.

    We would like to thank Dr Binbin Zhou from Zhejiang University for her help with programming. We also gratefully acknowledge the clinical contributors and the data producers from the International Cancer Genome Consortium (ICGC) for referencing the ICGC datasets.

    Footnotes

    Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.3721813.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Ilyas S, Yang JC. 2015 Landscape of tumor antigens in T cell immunotherapy. J. Immunol. 195, 5117–5122. (doi:10.4049/jimmunol.1501657)

    2. Robbins PF et al. 2013 Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat. Med. 19, 747–752. (doi:10.1038/nm.3161)

    3. Wang Q et al. 2011 Mutant proteins as cancer-specific biomarkers. Proc. Natl Acad. Sci. USA 108, 2444–2449. (doi:10.1073/pnas.1019203108)

    4. Stratton MR, Campbell PJ, Futreal PA. 2009 The cancer genome. Nature 458, 719–724. (doi:10.1038/nature07943)

    5. Alexandrov LB et al. 2013 Signatures of mutational processes in human cancer. Nature 500, 415–421. (doi:10.1038/nature12477)

    6. Baldwin RW, Embleton MJ, Price MR. 1983 Monoclonal antibody-defined antigens on tumor cells. Biomembranes 11, 285–312.

    8. Hudis CA. 2007 Trastuzumab: mechanism of action and use in clinical practice. N. Engl. J. Med. 357, 39–51. (doi:10.1056/NEJMra043186)

    9. Lee DW, Barrett DM, Mackall C, Orentas R, Grupp SA. 2012 The future is now: chimeric antigen receptors as new targeted therapies for childhood cancer. Clin. Cancer Res. 18, 2780–2790. (doi:10.1158/1078-0432.CCR-11-1920)

    10. Kochenderfer JN, Rosenberg SA. 2013 Treating B-cell cancer with T cells expressing anti-CD19 chimeric antigen receptors. Nat. Rev. Clin. Oncol. 10, 267–276. (doi:10.1038/nrclinonc.2013.46)

    11. Rajasagi M et al. 2014 Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood 124, 453–462. (doi:10.1182/blood-2014-04-567933)

    12. Schumacher TN, Schreiber RD. 2015 Neoantigens in cancer immunotherapy. Science 348, 69–74. (doi:10.1126/science.aaa4971)

    13. Sharma P, Allison JP. 2015 The future of immune checkpoint therapy. Science 348, 56–61. (doi:10.1126/science.aaa8172)

    14. Desrichard A, Snyder A, Chan TA. 2016 Cancer neoantigens and applications for immunotherapy. Clin. Cancer Res. 22, 807–812. (doi:10.1158/1078-0432.CCR-14-3175)

    15. Carreno BM et al. 2015 Cancer immunotherapy. A dendritic cell vaccine increases the breadth and diversity of melanoma neoantigen-specific T cells. Science 348, 803–808. (doi:10.1126/science.aaa3828)

    16. Hundal J, Carreno BM, Petti AA, Linette GP, Griffith OL, Mardis ER, Griffith M. 2016 pVAC-Seq: a genome-guided in silico approach to identifying tumor neoantigens. Genome Med. 8, 11. (doi:10.1186/s13073-016-0264-5)

    17. Zhang J, Mardis ER, Maher CA. 2016 INTEGRATE-Neo: a pipeline for personalized gene fusion neoantigen discovery. Bioinformatics 32, 511–517. (doi:10.1093/bioinformatics/btw674)

    18. Becker KF et al. 1999 Analysis of E-cadherin in diffuse-type gastric cancer using a mutation-specific monoclonal antibody. Am. J. Pathol. 155, 1803–1809. (doi:10.1016/S0002-9440(10)65497-1)

    19. Yu J et al. 2009 Mutation-specific antibodies for the detection of EGFR mutations in non-small-cell lung cancer. Clin. Cancer Res. 15, 3023–3028. (doi:10.1158/1078-0432.CCR-08-2739)

    20. McKenna A et al. 2010 The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303. (doi:10.1101/gr.107524.110)

    21. Hoof I, Peters B, Sidney J, Pedersen LE, Sette A, Lund O, Buus S, Nielsen M. 2009 NetMHCpan, a method for MHC class I binding prediction beyond humans. Immunogenetics 61, 1–13. (doi:10.1007/s00251-008-0341-z)

    22. Bolger AM, Lohse M, Usadel B. 2014 Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120. (doi:10.1093/bioinformatics/btu170)

    23. Li H, Durbin R. 2010 Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595. (doi:10.1093/bioinformatics/btp698)

    24. Li H, Durbin R. 2009 Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760. (doi:10.1093/bioinformatics/btp324)

    25. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009 The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079. (doi:10.1093/bioinformatics/btp352)

    26. DePristo MA et al. 2011 A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498. (doi:10.1038/ng.806)

    27. Cibulskis K et al. 2013 Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219. (doi:10.1038/nbt.2514)

    28. Yang H, Wang K. 2015 Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat. Protoc. 10, 1556–1566. (doi:10.1038/nprot.2015.105)

    29. Wang K, Li M, Hakonarson H. 2010 ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164. (doi:10.1093/nar/gkq603)

    30. Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J. 2009 SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25, 1966–1967. (doi:10.1093/bioinformatics/btp336)

    31. Sonnhammer EL, von Heijne G, Krogh A. 1998 A hidden Markov model for predicting transmembrane helices in protein sequences. Proc. Int. Conf. Intell. Syst. Mol. Biol. 6, 175–182.

    32. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001 Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580. (doi:10.1006/jmbi.2000.4315)

    33. Gourraud PA, Khankhanian P, Cereb N, Yang SY, Feolo M, Maiers M, Rioux JD, Hauser S, Oksenberg J. 2014 HLA diversity in the 1000 genomes dataset. PLoS ONE 9, e97282. (doi:10.1371/journal.pone.0097282)

    34. Uhlen M et al. 2015 Proteomics. Tissue-based map of the human proteome. Science 347, 1260419. (doi:10.1126/science.1260419)

    35. Yates A et al. 2016 Ensembl 2016. Nucleic Acids Res. 44, D710–D716. (doi:10.1093/nar/gkv1157)

    36. Elvin J, Potter C, Elliott T, Cerundolo V, Townsend A. 1993 A method to quantify binding of unlabeled peptides to class I MHC molecules and detect their allele specificity. J. Immunol. Methods 158, 161–171.

    37. Skora AD et al. 2015 Generation of MANAbodies specific to HLA-restricted epitopes encoded by somatically mutated genes. Proc. Natl Acad. Sci. USA 112, 9967–9972. (doi:10.1073/pnas.1511996112)

    38. Lai J, Zhou Z, Tang XJ, Gao ZB, Zhou J, Chen SQ. 2016 A tumor-specific neo-antigen caused by a frameshift mutation in BAP1 is a potential personalized biomarker in malignant peritoneal mesothelioma. Int. J. Mol. Sci. 17, 739. (doi:10.3390/ijms17050739)

    39. Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. 2004 A census of human cancer genes. Nat. Rev. Cancer 4, 177–183. (doi:10.1038/nrc1299)


    Page 7

    A promising route in the fight against major diseases, such as malaria [1,2], SARS [3], influenza [4], HIV [5] and toxoplasmosis [6], is a novel family of nanoparticle-based vaccines [7,8]. They rely on a special class of self-assembling protein nanoparticles (SAPNs) that form from multiple copies of a purpose-designed protein chain, functionalized to present epitope antigens on the particle surface. Other approaches to the design of protein-based nanoparticulate systems have been published by various research groups [9,10], and the architectures of such designs have been described with high accuracy [11,12]. A major challenge in the rational design of SAPNs lies in the control of their surface structures, as building blocks can self-assemble into a spectrum of different particle morphologies. Starting with the work of Raman et al. [13], several SAPN species have been synthesized, but in most cases their structures have not been completely determined, and nanoparticle populations are usually characterized only in terms of particle diameter. In some studies, the numbers of protein chains composing the particles have been identified. For example, Kaba et al. [1] and Raman et al. [13] report particles corresponding to assemblies of 60 chains; Pimentel et al. [3] describe SAPNs with 120 chains; Yang et al. [14] discuss species made of 180 and 300 chains; and Indelicato et al. [15] report assemblies of 240, 300 and 360 chains. Smaller assemblies, so-called LCM units containing 15 protein chains, have also been discussed and reported [13,16]. However, an exhaustive enumeration of all possible nanoparticle morphologies that can arise from multiple copies of a given type of building block is currently lacking. This presents a bottleneck in the prediction of the display of B-cell epitopes on the surface of the SAPNs to render them optimal repetitive antigen display systems.

    The challenge of enumerating all possible SAPN geometries is reminiscent of the one faced in the classification of virus structures. Similar to SAPNs, viruses assemble the protein containers that encapsulate their genomes (viral capsids) from multiple copies of a small number of different capsid proteins, in many cases a single type of capsid protein. These proteins typically group together in clusters of two, three, five or six in the capsid surface, akin to the clusters seen in SAPN architectures. Caspar & Klug’s seminal classification scheme of viral architectures [17] relies on a geometric approach, predicting the spectrum of possible virus architectures in terms of the numbers and relative positions of these protein clusters (capsomeres) with reference to spherical surface lattices. This classification has revolutionized our understanding of virus structure and plays a key role in the interpretation of experimental data in virology. It was developed for particles with icosahedral symmetry and, as such, can also be used for synthetic vaccines based on virus-like particles, but it is not suitable for modelling SAPNs.

    We develop here a classification scheme for SAPN morphologies in terms of surface tessellations and associated graphs that pinpoint the positions of the protein building blocks in the particle surfaces. Our approach exploits the geometric relation of SAPN morphologies to fullerene architecture, and further develops tools that have been introduced for fullerene classification. As a result, we present a procedure to classify both symmetric and asymmetric SAPN morphologies, and we deliver a classification of the high- and low-symmetry particles seen in experiments. In particular, we explicitly determine particle morphologies for symmetric particles formed from up to 360 protein building blocks, as there is experimental evidence that spherical particles up to this size should exist, and these are relevant for vaccine design [1,3,14,15]. Defective nanoparticles are not considered in this work, as they require a different mathematical model, and will be the object of future investigation.

    SAPNs are formed from multiple copies of a single protein building block (PBB) that is designed to self-assemble into particles via formation of specific cluster types. We focus here on SAPNs used in vaccine design, with PBBs given by pairs of linked helices (figure 1a). These are designed to interact via formation of trimeric and pentameric coiled coils involving, respectively, three (blue) and five (green) helices of different PBBs. SAPN architectures are thus characterized by the numbers and positions of these threefold and fivefold clusters.


    Figure 1. SAPN architecture and nanoparticle graphs. (a) The SAPN building blocks consist of two fused polypeptide helices that cluster in groups of three (black sphere) and five (white sphere) in the nanoparticle shell. (b) Nanoparticle graphs correspond to spherical tessellations in terms of rhombs and hexagons, with vertices labelled alternatingly by black and white spheres. (c) A SAPN formed from 180 PBBs, together with its nanoparticle graph (adapted from a figure by N. Wahome and P. Burkhard). The nanoparticle model was built using a variety of adapted tools from the CCP4 program suite (www.ccp4.ac.uk/), the modelling software O (xray.bmc.uu.se/) and data from the RCSB database (www.rcsb.org/). The nanoparticle graph was obtained by modifying a fullerene graph from the library of the Fullerene Program [18].


    As the trimeric and pentameric coiled coils are connected in the PBBs, SAPNs can be represented as spherical graphs in which vertices mark trimer (black spheres in figure 1) and pentamer (white spheres) positions, and edges represent the PBBs connecting them. We refer to these graphs as nanoparticle graphs. In vaccine design, the PBB helices are functionalized, e.g. via an extension of the trimer-forming helices by viral epitopes as in the case of the SARS HRC1 [3]. Information on the positions of the trimeric coiled coils therefore provides insights into epitope location in the nanoparticle surface. For example, figure 1c illustrates how nanoparticle graphs translate into SAPN morphologies, based on the example of a particle formed from 180 PBBs. It has 36 pentameric and 60 trimeric clusters, with epitope positions marked by black spheres. A classification of nanoparticle graphs thus provides an atlas of SAPN geometries and epitope positions.

    By construction, nanoparticle graphs have two types of vertices, V3 and V5, in which, respectively, precisely three or five edges meet. From a mathematical point of view, they are bipartite, (3,5)-regular spherical graphs. Such graphs can be viewed as spherical surface tessellations (tilings) in terms of shapes that have an even number of edges connecting, alternatingly, vertices from V3 and V5. For the sake of simplicity, we focus our analysis on tessellations in terms of hexagons and rhombs (i.e. the shapes with the smallest number of edges) with edges alternatingly marked via black and white spheres along their boundaries (figure 1b).

    As each PBB corresponds to an edge in the nanoparticle graph, connecting a trimeric coiled coil (a vertex from V3) with a pentameric coiled coil (a vertex from V5), the number N of its edges must satisfy N=3|V3|=5|V5|. This results in the restriction

    N = 15m, |V3| = 5m, |V5| = 3m,

    with m∈N, implying that the number of PBBs in any particle must be a multiple of 15.

    For a nanoparticle graph with N=15m chains, Euler’s formula f=2−v+e relates the numbers of vertices v=|V3|+|V5|=8m, edges e and faces f of the corresponding spherical tiling. Using the fact that edges fulfil the condition 4r+6x=2e=2N=30m, with r and x denoting the number of rhombs and hexagons, respectively, one obtains

    r = 6(m + 1), x = m − 4.

    As the number of hexagons must be zero or larger, this implies m≥4, and the nanoparticle with N=60 is thus the smallest possible option. Its nanoparticle graph corresponds to a rhombic triacontahedron, i.e. an icosahedrally symmetric polyhedron with 30 rhombic faces, 60 edges, 12 fivefold vertices, and 20 threefold vertices.
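
    These counting relations are easy to check computationally. The following minimal sketch (our Python illustration, not code from the paper) tabulates the vertex, edge and face counts for a few particle sizes and asserts Euler's formula together with the edge-count condition:

```python
# Minimal sketch (ours): tabulate the combinatorics of a nanoparticle graph
# with N = 15m edges, |V3| = 5m trimers and |V5| = 3m pentamers, and check
# Euler's formula together with the edge condition 4r + 6x = 2N.

def nanoparticle_counts(m):
    if m < 4:
        raise ValueError("need m >= 4, since x = m - 4 cannot be negative")
    edges = 15 * m                 # one edge per protein building block
    v3, v5 = 5 * m, 3 * m          # trimer and pentamer vertices
    faces = 2 - (v3 + v5) + edges  # Euler: f = 2 - v + e on the sphere
    rhombs, hexagons = 6 * (m + 1), m - 4
    assert rhombs + hexagons == faces
    assert 4 * rhombs + 6 * hexagons == 2 * edges
    return {"N": edges, "|V3|": v3, "|V5|": v5,
            "rhombs": rhombs, "hexagons": hexagons}

for m in (4, 8, 12, 24):           # N = 60, 120, 180, 360
    print(nanoparticle_counts(m))
```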

    An exhaustive enumeration of nanoparticle graphs is a combinatorial challenge. We introduce here a method that relates SAPN geometries with those of fullerene cages, i.e. three-coordinated cages with vertices formed from carbon atoms. From a mathematical point of view, fullerenes correspond to three-regular spherical graphs with 12 pentagonal and otherwise hexagonal faces, and their geometries have been classified previously [18–20]. Using the method presented below, this classification of fullerene graphs can be used to derive a classification of SAPNs in terms of nanoparticle graphs.

    From nanoparticles to fullerenes. To any nanoparticle graph N with isolated hexagons, i.e. in which hexagonal tiles do not share a vertex, a unique fullerene graph F can be associated via the following vertex addition rule (figure 2). In step one, a trimer is added at the centre of every hexagonal face and is connected to the white vertices (pentamers) on its boundary, resulting in a tessellation in terms of rhombs (graph N′). In step two, every pair of black vertices (trimers) on the boundary of the same rhomb is connected along a diagonal of the rhomb. In step three, vertices from V5 (white) and all edges of N′ are removed. The remaining vertices V′3, given by the union of V3 (black vertices) and the (red) vertices added in step one, together with their connections via the edges added in step two, define the fullerene graph F. The vertex addition rule relates the number of vertices, edges and faces of a nanoparticle graph with that of its fullerene graph counterpart according to table 1.


    Figure 2. The vertex addition rule for the construction of the fullerene equivalent of a nanoparticle graph. (a) A portion of a nanoparticle graph; (b) the vertex addition rule adds additional three-coordinated vertices (red) at the centres of the hexagonal faces with connections to the five-coordinated vertices (white) of the nanoparticle graph; (c) pairs of trimers belonging to the same rhomb are connected by a dashed line; (d) removal of all edges of the nanoparticle graph in (a) results in a fullerene graph; (e) the nanoparticle graph for a particle with N=180 PBBs; and (f) the fullerene graph C68 (Td) obtained via the vertex addition rule, with red points representing the trimers added to the nanoparticle graph in the procedure.


    Table 1. Relationship between a nanoparticle graph and its associated fullerene graph. (Blank entries are not applicable.)

                           nanoparticle N    nanoparticle N′    fullerene F
    edges                  15m               18m−12             9m−6
    degree-3 vertices      5m                6m−4               6m−4
    degree-5 vertices      3m                12
    degree-6 vertices                        3m−12
    rhombic faces          6(m+1)            9m−6
    hexagonal faces        m−4                                  3m−12
    pentagonal faces                                            12

    From fullerenes to nanoparticles. The above procedure is not always reversible. Reversal would require completion of the following three steps. In step one, the set V5 of the nanoparticle graph is constructed by placing a vertex at the centre of each face of the fullerene graph F, i.e. by adding the vertices of the dual graph of F to the vertices V′3 of the fullerene graph. In step two, each such vertex is connected to those vertices from V′3 that are located on the same face, and all edges of the fullerene graph are removed. This yields a bipartite graph N′ with vertices of degree 3 (V′3) and vertices of degree either 5 or 6 (V5). Finally, in order to obtain a nanoparticle graph N, removal of vertices from V′3 is required so that all vertices in V5 have degree 5. This requires eliminating (colouring) exactly one vertex of the fullerene graph F for each hexagonal face, and none from the pentagonal faces; we refer to this as the vertex colouring rule in the following. Such a colouring may not be possible, or may not be unique. A necessary condition for a fullerene graph to result in a nanoparticle graph with N = 15m edges via the vertex colouring rule is that the fullerene graph must have 6m − 4 vertices, corresponding to the sum of the number of vertices V3 and the number of hexagonal faces x of the nanoparticle graph. In the following, we therefore classify SAPN morphologies, symmetric or not, starting from the fullerene graphs C6m−4, which have been classified previously [18–20].
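
    The vertex colouring rule can be phrased as a small search problem. The sketch below is our illustration under an assumed input format (the faces of the fullerene graph given as lists of vertex identifiers, which is hypothetical and not a format specified in the paper); it enumerates all vertex sets that remove exactly one vertex per hexagonal face and none from any pentagonal face:

```python
# Our illustration of the vertex colouring rule, assuming a hypothetical input
# format: the faces of a fullerene graph given as lists of vertex ids.
# Goal: find every set S of fullerene vertices such that each hexagonal face
# contains exactly one vertex of S and each pentagonal face contains none.
# Each valid S turns the fullerene graph into one nanoparticle graph.

def colourings(faces):
    pentagons = [set(f) for f in faces if len(f) == 5]
    hexagons = [set(f) for f in faces if len(f) == 6]
    forbidden = set().union(*pentagons) if pentagons else set()
    # candidate vertices per hexagon: those touching no pentagon
    candidates = [[v for v in h if v not in forbidden] for h in hexagons]
    solutions = set()

    def extend(i, chosen):
        if i == len(hexagons):
            solutions.add(frozenset(chosen))
            return
        hit = chosen & hexagons[i]
        if hit:                      # hexagon i already touched by a choice
            if len(hit) == 1:        # exactly one removed vertex: it is done
                extend(i + 1, chosen)
            return                   # two or more removed: dead end, backtrack
        for v in candidates[i]:      # otherwise remove one of its vertices
            extend(i + 1, chosen | {v})

    extend(0, set())
    return solutions
```

    For C20 (twelve pentagons, no hexagons), the only solution is the empty set, reproducing the rhombic triacontahedron particle with N = 60.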

    Fullerene cages can have varying degrees of symmetry, including the icosahedral symmetry of the Buckminster fullerene and carbon onions, the lower dihedral symmetries of prolate architectures, and the asymmetric shapes of fullerene cones. Similarly, nanoparticle graphs and SAPNs can vary in symmetry. We start with a classification of nanoparticle graphs with non-planar symmetries, i.e. those with at least two different types of symmetry axes. Note that fourfold symmetry axes cannot occur. This is because vertices of nanoparticle graphs cannot occupy fourfold axes, and octagonal faces are excluded. Therefore, icosahedral and tetrahedral symmetries are the only possible non-planar options.

    Symmetry imposes strong restrictions on the number N of edges of the nanoparticle graph, so that only particles with certain numbers of PBBs are allowed. In order to construct the nanoparticle graphs for these cases explicitly, we adapt methods used previously in the context of fullerene architecture. In particular, for the modelling of the icosahedrally symmetric nanoparticle graphs we adapt the Goldberg–Coxeter procedure [21,22], and for the tetrahedral graphs we use its extension to tetrahedral symmetry by Fowler et al. [23]. In each case, we first construct the fullerenes with required symmetry and number of edges, and then derive the corresponding nanoparticle graphs via the vertex colouring rule in figure 2.

    We first derive restrictions owing to symmetry. Consider the icosahedral group I acting on the nanoparticle graph (embedded into a sphere). Denote by td and pd the number of trimers and pentamers in generic positions in the fundamental domain, i.e. those not positioned on symmetry axes of the particle. Then, for the particle to have icosahedral symmetry, the following relationship has to be fulfilled:

    N = 3(20α + 60td) [total number of trimers] = 5(12γ + 60pd) [total number of pentamers].

    Here, α = 1 or 0 indicates the presence or absence of trimers on the threefold axes of icosahedral symmetry, and γ = 1 or 0 indicates the presence or absence of pentamers on the fivefold axes. Note that, as I is a subgroup of the full icosahedral group Ih, this restriction also holds for nanoparticles with full icosahedral symmetry. There are only two solutions up to N = 360, given by N = 60 and N = 360.
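
    A brute-force check of this restriction is straightforward; the following sketch (ours) recovers N = 60 and N = 360 as the only admissible chain numbers up to 360:

```python
# Sketch (ours): chain numbers N <= 360 compatible with icosahedral symmetry,
# i.e. N = 3(20a + 60t_d) = 5(12g + 60p_d) with a, g in {0, 1}.

from itertools import product

solutions = set()
for a, g, td, pd in product((0, 1), (0, 1), range(7), range(7)):
    trimer_chains = 3 * (20 * a + 60 * td)      # 3 chains per trimer
    pentamer_chains = 5 * (12 * g + 60 * pd)    # 5 chains per pentamer
    if trimer_chains == pentamer_chains and 0 < trimer_chains <= 360:
        solutions.add(trimer_chains)

print(sorted(solutions))  # [60, 360]
```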

    We use the Goldberg–Coxeter construction for fullerenes to determine the corresponding nanoparticle graphs. In this construction, a fullerene graph is represented as a superposition of an icosahedral surface (20 equilateral triangular faces) on a planar hexagonal grid such that the icosahedral vertices coincide with centres of the hexagonal tiles (figure 3). The positions of the carbon atoms in the fullerene then correspond to the vertices of the hexagonal tiles that overlap with the embedded icosahedral surface. Denoting one of the icosahedral vertices as O, the construction is fully determined by specifying the location of a second vertex P on the same triangular face in terms of integer coordinates (i,j) in the hexagonal lattice basis e1 and e2. The equilateral triangle with P = (1,0) by construction contains only one vertex of the fullerene graph, i.e. one carbon atom. Denoting the area of this triangle as Δ, an equilateral triangle with vertices at (0,0) and P = (i,j) has area (i² + ij + j²)Δ, and therefore contains i² + ij + j² vertices of the fullerene. Given that the planar net of the icosahedron contains 20 equilateral triangular faces, fullerenes with icosahedral symmetry are only possible if they have 20(i² + ij + j²) vertices. As only fullerene graphs with 6m − 4 vertices can correspond to nanoparticle graphs with N = 15m chains (recall table 1), we obtain the condition

    i² + ij + j² = (N − 10)/50.

    The two possible solutions N=60 and N=360 correspond to isomers with (i,j)=(1,0) or (i,j)=(0,1), and (i,j)=(2,1) or (i,j)=(1,2), respectively. In each case, we construct the planar net and apply the vertex colouring rule. In the first case, the nanoparticle graph has no hexagons and corresponds to the rhombic triacontahedron. In the second case, colouring compatible with icosahedral symmetry is indeed possible and results in two structures that are identical up to helicity (cf. table 2).
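
    The Goldberg–Coxeter condition can likewise be solved by a short search; the sketch below (our illustration) recovers the quoted (i,j) pairs:

```python
# Sketch (ours): solve i^2 + ij + j^2 = (N - 10)/50 for the admissible
# icosahedral chain numbers; each (i, j) specifies a Goldberg-Coxeter net.

def gc_pairs(N):
    if (N - 10) % 50:
        return []                   # (N - 10)/50 must be an integer
    T = (N - 10) // 50
    return [(i, j) for i in range(T + 1) for j in range(T + 1)
            if i * i + i * j + j * j == T]

print(gc_pairs(60))   # [(0, 1), (1, 0)]
print(gc_pairs(360))  # [(1, 2), (2, 1)]
```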

    Figure 3. The Goldberg–Coxeter construction for particles with icosahedral symmetry. A superposition of an icosahedral net on a hexagonal tessellation determines the positions of hexagonal and pentagonal faces in a fullerene. The example shows the construction for a particle with (i,j)=(2,1), where O=(0,0), P=(2,1) and Q=(−1,3) with respect to the triangular lattice basis (e1,e2).


    Table 2. Nanoparticles with non-planar symmetries. (Data on fullerenes in this table are excerpts from the Fowler–Cremona–Steer classification. A blank entry in the last column means that the colouring yields no nanoparticle.)

    N     fullerene   fullerene symmetry   (i,j,h,k)     nanoparticle symmetry
    60    C20         Ih                   (1,0,0,1)     Ih
    120   C44         T                    (2,0,0,1)     T
    180   C68         T                    (1,2,−1,1)
          C68         Td                   (1,2,0,2)     Td
    240   C92         T                    (2,1,0,2)     T
          C92         Th                   (2,1,1,2)     Th
          C92         Td                   (3,1,0,1)     T, T (chiral)
    300   C116        T                    (1,2,−2,2)    T, T
          C116        Th                   (1,2,−1,3)    T, T (chiral)
          C116        T                    (4,0,0,1)     T
    360   C140        I                    (1,2,−2,3)    I
          C140        T                    (2,3,−1,1)
          C140        T                    (3,1,0,2)
          C140        T                    (3,1,1,2)

    As before, we first derive symmetry restrictions on N. Denoting by td and pd the number of trimers and pentamers in generic position in the fundamental domain, the symmetry condition is

    N = 3(4α + 4β + 12td) = 5(12pd),

    where α, β ∈ {0,1} indicate the absence or presence of trimeric clusters on the two types of threefold sites. Note that these correspond, respectively, to corners and centres of faces of a tetrahedron. The solutions specify the allowed chain numbers for particles with tetrahedral symmetry. Up to and including 360 chains, these are N = 60 for (α,β,td,pd) = (1,1,1,1); N = 120 for (0,1,3,2) and (1,0,3,2); N = 180 for (0,0,5,3); N = 240 for (1,1,6,4); N = 300 for (0,1,8,5) and (1,0,8,5); and N = 360 for (0,0,10,6). By table 1, these correspond to fullerenes Cn with n = 20, 44, 68, 92, 116, 140. Note also that, because T is a subgroup of the tetrahedral groups Th and Td, the above restrictions also hold for nanoparticles with higher tetrahedral symmetry. Fullerenes with tetrahedral symmetry can be constructed via the Fowler–Cremona–Steer construction [23], which is based on the superposition of the surface of a polyhedron with tetrahedral symmetry onto a planar hexagonal tessellation, as shown in figure 4. The polyhedral surface corresponds to the union of three types of triangles, equilateral and scalene, which are characterized via a quadruplet of integers (i,j,h,k) as follows: the four large equilateral triangles are given as in the Goldberg–Coxeter construction via (i,j), and the four small equilateral triangles by (h,k) (points P and Q in figure 4); the 12 scalene triangles are then implied by the dimensions and positions of these two triangle types. If all edges are of the same length and the angles between the large and small equilateral triangles are 60° (which is the case for h = −j and k = i + j), then this construction results in an icosahedral net as in figure 3. In general, the area of the polyhedral surface is 4(i² + ij + j² + h² + hk + k² + 3(ik − jh)) times the area of a small equilateral triangle with vertices at (0,0) and (1,0). Using table 1, we thus obtain that the corresponding nanoparticle graph with N edges must satisfy the identity

    N = 10 + 10(i² + ij + j² + h² + hk + k² + 3(ik − jh)).

    We construct the planar nets for all tetrahedral solutions above, using the (i,j,h,k) vectors from the Fowler–Cremona–Steer classification (table 2), and check whether the vertex colouring procedure can be applied to obtain a nanoparticle graph. Note that the colouring is not always possible, and that there are cases in which there are different nanoparticles corresponding to the same fullerene. We list all resulting nanoparticle graphs with at least tetrahedral symmetry in table 2 and provide the corresponding atlas in figure 5. We give an explicit example of the full net of a tetrahedral fullerene graph and its associated nanoparticle graph (electronic supplementary material, figures S1 and S2).
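
    As a consistency check, the identity above can be evaluated for the (i,j,h,k) quadruples of table 2; the following sketch (ours) confirms that each quadruple reproduces the expected chain number N:

```python
# Sketch (ours): verify that the (i, j, h, k) quadruples quoted in table 2
# reproduce N = 10 + 10*(i^2 + ij + j^2 + h^2 + hk + k^2 + 3(ik - jh)).

def chains(i, j, h, k):
    return 10 + 10 * (i*i + i*j + j*j + h*h + h*k + k*k + 3 * (i*k - j*h))

table2 = {
    (1, 0, 0, 1): 60,  (2, 0, 0, 1): 120, (1, 2, -1, 1): 180, (1, 2, 0, 2): 180,
    (2, 1, 0, 2): 240, (2, 1, 1, 2): 240, (3, 1, 0, 1): 240,
    (1, 2, -2, 2): 300, (1, 2, -1, 3): 300, (4, 0, 0, 1): 300,
    (1, 2, -2, 3): 360, (2, 3, -1, 1): 360, (3, 1, 0, 2): 360, (3, 1, 1, 2): 360,
}
for quad, N in table2.items():
    assert chains(*quad) == N, (quad, chains(*quad))
print("all table 2 quadruples check out")
```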

    Figure 4. (a) The Fowler–Cremona–Steer construction for fullerenes with tetrahedral symmetry. Here O = (0,0), P = (1,2), Q = (0,2) with respect to the triangular lattice basis, so that (i,j) = (1,2) and (h,k) = (0,2). The domain used here for the construction of the nanoparticles (corresponding in area to three times the fundamental domain) is shown highlighted; (b) a close-up of this domain for the case of fullerene C68, with areas corresponding to portions of six of the 12 pentagons of the fullerene shown in grey; and (c) a close-up of the domain in the corresponding nanoparticle graph (N = 180), with trimers deleted according to the vertex colouring rule shown in red.



    Figure 5. Atlas of tetrahedral and icosahedral nanoparticles; the depicted domains are the union of three fundamental domains of the tetrahedral group (cf. figure 4).


    The procedure introduced above allows one to construct nanoparticle graphs with arbitrary symmetry, or with none at all. In particular, as rhombic and hexagonal faces are incompatible with sixfold axes, neither sixfold rotational symmetry axes nor D6 symmetry are possible. By contrast, particles with D5 and D3 symmetry can occur.

    Particles with D5 symmetry must fulfil the necessary condition

    N = 5(2α + 10pd) = 3(10td),

    where α=1 when the two sites of fivefold symmetry are each occupied by pentamers, and pd,td have the same meaning as before. Note that the exclusion of decagonal tiles implies α=1. There are only three possible solutions for chain numbers up to and including 360: N=60 (and C20) for (pd,td,m)=(1,2,4); N=210 (and C80) for (pd,td,m)=(4,7,14); and N=360 (and C140) for (pd,td,m)=(7,12,24). As before, models of fullerenes with D5 symmetry can be constructed by superimposing the general planar net of a polyhedron with such symmetry onto a hexagonal tessellation of the plane (cf. [23]). This again requires the specification of four integers (i,j,h,k), and corresponding values are listed in table 3.

    Table 3. Nanoparticles derived from fullerene graphs with D5 symmetry. (A blank entry in the last column means that the colouring yields no nanoparticle.)

    N     fullerene   fullerene symmetry   (i,j,h,k)      nanoparticle symmetry
    60    C20         Ih                   (1,−1,1,0)     Ih
    210   C80         D5d                  (4,−7,1,0)
          C80         D5d                  (3,−2,1,1)
          C80         Ih                   (2,−2,2,0)
          C80         D5h                  (1,−2,2,0)
          C80         D5h                  (1,0,2,1)      D5, D5
    360   C140        D5d                  (7,−13,1,0)
          C140        D5d                  (6,−5,1,1)
          C140        I                    (3,−1,1,2)     I, D5
          C140        D5                   (3,−5,2,0)
          C140        D5                   (1,0,2,2)
          C140        D5                   (1,0,3,1)      D5, D5

    Note that the nanoparticle corresponding to C20 yields the classical icosahedral solution, while the isomer of C80 with coordinates (1,0,2,1) results in two different particles with D5 symmetry. Only two isomers of C140 yield solutions upon colouring, giving four particles in total: three with D5 symmetry and one with I symmetry. All colourings generating nanoparticles with at least D5 symmetry are listed in table 3.

    Regarding D3 symmetry, the necessary condition is

    N = 5(6pd) = 3(6td + 2α),

    where α∈{0,1} indicates the absence or presence of trimeric clusters on the particle threefold axes. Inspection of the Fowler–Cremona–Steer construction shows that the two sites of threefold symmetry must both be occupied by trimers, so that α=1. There are four solutions for particles up to 360 chains as follows: N=60 (C20) with (pd,td,m)=(2,3,4); N=150 (C56) with (pd,td,m)=(5,8,10); N=240 (C92) with (pd,td,m)=(8,13,16); and N=330 (C128) with (pd,td,m)=(11,18,22). The general planar net of a polyhedron with D3 symmetry can be represented as the union of four types of triangles, equilateral and scalene, which require the specification of six integers of the form (0,1,1,0,0,n′) with n′≥1 [23]. Values for nanoparticles corresponding to the D3 solutions are listed in table 4.

    Table 4. Nanoparticles derived from fullerene graphs with D3 symmetry.

    N     fullerene   fullerene symmetry   (0,1,1,0,0,n′)    nanoparticle symmetry
    60    C20         Ih                   (0,1,1,0,0,1)     Ih
    150   C56         D3d                  (0,1,1,0,0,7)     D3, D3
    240   C92         D3d                  (0,1,1,0,0,13)    4 solutions with D3
    330   C128        D3d                  (0,1,1,0,0,19)    8 solutions with D3
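
    The D5 and D3 restrictions derived above can also be checked by brute force; the sketch below (our illustration, with α = 1 in both cases, as forced by the tile-exclusion arguments given in the text) recovers the quoted chain numbers:

```python
# Sketch (ours): chain numbers up to 360 for D5 and D3 symmetry, using
# N = 5(2 + 10*p_d) = 3(10*t_d) for D5 and N = 5(6*p_d) = 3(6*t_d + 2)
# for D3 (alpha = 1 in both conditions).

d5 = sorted({5 * (2 + 10 * pd) for pd in range(8)}
            & {3 * 10 * td for td in range(13)})
d3 = sorted({5 * 6 * pd for pd in range(13)}
            & {3 * (6 * td + 2) for td in range(20)})

print("D5:", [n for n in d5 if 0 < n <= 360])  # [60, 210, 360]
print("D3:", [n for n in d3 if 0 < n <= 360])  # [60, 150, 240, 330]
```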

    Nanoparticles with lower symmetry can also occur in all of these cases if the vertex colouring rule is applied to the associated fullerene graphs C6m−4 in such a way that the symmetry is reduced or broken. An example is provided in the electronic supplementary material, figure S3, which shows all ways in which the symmetry of the icosahedral particle with 360 chains can be reduced. This demonstrates how the procedure developed here can be used to determine all lower-symmetry alternatives for any of the higher-symmetry particles listed above.

    These results pave the way for the optimization of SAPN morphologies for applications in vaccine design. Repetitive antigen display is a key determinant of an optimal humoral immune response [24–32], and SAPNs represent an ideal model system for it. They are similar to virus-like particles, but have the advantage of greater flexibility in protein design, allowing different architectures to be tested relatively easily. B-cell epitopes can be attached to either end of the protein chain and will thus be displayed close to the trimer and pentamer vertices of the particular SAPN architecture.

    The geometries outlined above allow straightforward calculation of the distances between epitopes. This defines the epitope density, which in turn is related to the strength of the immune response: several decades ago, in a landmark publication, Dintzis et al. [33] related the epitope density to the so-called immunon, a determinant of the strength of the immune response. Based on our results, we can estimate the average distance between trimers and pentamers by a simple density argument. Given a nanoparticle graph N with N = 15m edges, consider the associated graph N′, in which each hexagon is replaced by three rhombs (figure 2), so that N′ has only rhombic faces, specifically 9m − 6 of them (by table 1). Assuming that all rhombs are approximately equal in area, shape and side length, the area A of a spherical rhomb on the surface of a sphere of radius r can be estimated as

    A ∼ 4πr²/(9m − 6).

    Given the area of the rhomb and using spherical geometry, we obtain the average distances between trimers and between pentamers on a sphere of radius r listed in table 5.

    Table 5. The average distances between trimers and between pentamers on a sphere of radius r.

    N = 15m   average distance between trimers   average distance between pentamers
    60        0.7136 r                           1.0514 r
    120       0.5269 r                           0.7607 r
    150       0.4763 r                           0.6834 r
    180       0.4381 r                           0.6257 r
    210       0.4078 r                           0.5805 r
    240       0.3830 r                           0.5439 r
    300       0.3446 r                           0.4875 r
    330       0.3293 r                           0.4652 r
    360       0.3159 r                           0.4457 r
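
    The first row of table 5 can be checked directly: for N = 60 the nanoparticle graph is the rhombic triacontahedron, so the trimers sit at the vertices of a dodecahedron and the pentamers at those of an icosahedron, projected onto the sphere. The following sketch (ours) computes the nearest-neighbour chord distances on the unit sphere:

```python
# Sketch (ours): check the first row of table 5. For N = 60 the trimers are
# dodecahedron vertices and the pentamers icosahedron vertices, projected
# onto the unit sphere; nearest-neighbour chord distances should be about
# 0.7136 and 1.0514, as quoted in table 5.

import itertools
import math

phi = (1 + math.sqrt(5)) / 2  # golden ratio

icosa = [p for s1 in (1, -1) for s2 in (1, -1)
         for p in ((0, s1, s2 * phi), (s1, s2 * phi, 0), (s2 * phi, 0, s1))]
dodeca = [(s1, s2, s3) for s1 in (1, -1) for s2 in (1, -1) for s3 in (1, -1)]
dodeca += [p for s1 in (1, -1) for s2 in (1, -1)
           for p in ((0, s1 / phi, s2 * phi), (s1 / phi, s2 * phi, 0),
                     (s2 * phi, 0, s1 / phi))]

def min_chord(points):
    # project onto the unit sphere, then take the smallest pairwise distance
    unit = [tuple(c / math.dist(p, (0, 0, 0)) for c in p) for p in points]
    return min(math.dist(a, b) for a, b in itertools.combinations(unit, 2))

print(f"{min_chord(dodeca):.5f}")  # 0.71364 (trimer-trimer, 0.7136 r)
print(f"{min_chord(icosa):.5f}")   # 1.05146 (pentamer-pentamer, 1.0514 r)
```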

    The epitopes can be displayed at either end of the SAPN protein chain, i.e. at the trimer or at the pentamer; identical epitopes will, however, always be on the same oligomerization domain. Computer modelling and experimental analysis have shown that the radius of the central cavity of the SAPNs (i.e. where the two coiled coils are joined together) is about 3 nm for a SAPN with N = 60, and the dimension of the central cavity will increase with the number of protein chains per particle. Also, a B-cell epitope will not be located directly on top of a vertex but rather roughly on top of an individual α-helical axis. The distances of the coiled-coil α-helical axes from the trimer and pentamer axes are about 0.65 nm and 0.85 nm, respectively [34,35]. These two values have to be subtracted from the calculated distances between two trimer or two pentamer vertices in table 5.

    If the B-cell epitope itself is a coiled-coil trimer, as for example in the SARS vaccines [3], then we can calculate the distance between adjacent B-cell epitopes for a given length of the coiled coil. For instance, in the SARS nanoparticle with N = 120 and a helix length of about 7 nm, the distance between epitopes located at trimeric sites would be about 4.6 nm. If the B-cell epitope is not a coiled coil, which has a rather extended shape, then the particular dimensions of the epitope also have to be taken into consideration; a folded protein domain quite likely has a roughly spherical shape, and a protein like lysozyme, for example, is about 3.5 nm across. Using a particular SAPN architecture, the B-cell epitope can thus be placed in an array with a rather precise spacing that depends on the lengths of the coiled coils of the SAPN. This gives the vaccinologist a tool to optimize the vaccine for the best immune response.
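
    This spacing estimate is easily scripted. The sketch below is our reading of the procedure, with two labelled assumptions: that the 0.65 nm trimer offset is subtracted once from the vertex-vertex distance, and that the relevant radius for the SARS example is the 3 nm cavity plus the 7 nm helix length; both are illustrative choices that happen to reproduce the approximately 4.6 nm quoted above, not values stated in this form in the paper.

```python
# Sketch (ours) of the epitope-spacing estimate: scale the table 5
# coefficient by the particle radius and subtract the helix-axis offset
# (0.65 nm at trimer sites). Radius below = 3 nm cavity + 7 nm helix,
# an assumed illustrative value.

TRIMER_COEFF = {60: 0.7136, 120: 0.5269, 180: 0.4381, 360: 0.3159}
TRIMER_AXIS_OFFSET_NM = 0.65

def trimer_epitope_spacing(n_chains, radius_nm):
    """Approximate spacing (nm) of epitopes displayed at trimer sites."""
    return TRIMER_COEFF[n_chains] * radius_nm - TRIMER_AXIS_OFFSET_NM

print(trimer_epitope_spacing(120, radius_nm=3.0 + 7.0))  # ~4.6 nm
```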

    The classification presented here provides, to our knowledge, the first complete atlas of SAPN geometries of D3 symmetry or higher, together with a construction method for all particles, including low-symmetry and asymmetric ones. We have demonstrated previously that a combinatorial analysis of SAPN structures can be an invaluable tool in the interpretation of experimental data. In particular, biophysical methods such as analytical ultracentrifugation can provide information on the numbers of chains N in the particles that occur in the self-assembly process. Combinatorics then narrows the spectrum of options down to a limited ensemble of particle geometries compatible with this range of chain numbers, and identifies the precise surface structures of the particles in terms of the placements of all protein chains and threefold and fivefold coiled coils. It also offers a glimpse of the complexity of the assembly process in terms of the numbers of different particles that can occur in a given range of chain numbers. In previous work [15], a full classification was not yet available; it was therefore only possible to identify candidates for the particles seen in experiments, and an exhaustive enumeration was not possible.

    The construction method with reference to fullerene architecture introduced here provides a step change. It offers for the first time, to our knowledge, insights into the full spectrum of particles of arbitrary size and morphology occurring in an experiment. This exhaustive approach therefore opens up opportunities for the analysis of experimental data that had not been possible before. For example, it is now possible to apply statistical mechanics approaches and construct partition functions describing the outcome of the assembly experiments. These can be used to better understand the assembly process itself in terms of the most likely, dominant assembly pathways. This, in turn, will provide pointers for experimentalists on how to optimize the assembly procedure, e.g. in terms of the yield of desired particle types. The detailed insights into the connectivity of each chain in the nanoparticle surface moreover enable computer reconstructions of the nanoparticles, as in the example in figure 1c. These can then be used to engineer specific architectures by controlling the rigidity of the links and the angle between the coiled coils (an issue not addressed here).

    Most importantly, however, the results obtained here enable the identification of SAPN morphologies that have not yet been synthesized, and thus enable the rational design of desired particle morphologies. In particular, our approach links SAPN morphologies with epitope positions, and therefore provides a tool for the identification of SAPN morphologies with optimal properties for vaccine design. However, if the SAPNs are co-assembled from different chains, i.e. if they are composed of epitope-decorated units and protein chains lacking epitopes, then the assembly forms will be much more difficult to predict; depending on the B-cell epitope, chains carrying epitopes may cluster together if there are attractive forces between the epitopes. We also do not exclude the possibility that SAPNs form with an irregular arrangement of protein chains owing to imperfect propagation of the lattice in all directions; this would lead to forms that are chimeric with respect to the architectures described here.

    The paper is a theoretical study and is self-contained.

    G.I., P.B. and R.T. designed research; G.I. and R.T. performed research; and G.I., P.B. and R.T. drafted the manuscript. All authors gave final approval for publication.

    P.B. is CEO of the company Alpha-O Peptides and has patents or patents pending on the technology. Alpha-O Peptides did not fund any of this research. The paper is a mathematical study that refers to self-assembling protein nanoparticles (SAPNs), a technology that has been patented by P.B.

    Funding for G.I.’s visit to York from the Italian National Group of Mathematical Physics (GNFM-INdAM) and from EPSRC grant EP/K028286/1 is gratefully acknowledged. R.T. also thanks the Royal Society for a Royal Society Leverhulme Trust Senior Research Fellowship (LT130088).

    Part of this work was carried out at the Mathematical Biosciences Institute (MBI) at The Ohio State University; R.T. and G.I. would like to thank the MBI for funding and hospitality.

    Footnotes

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3741236.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Kaba SA et al. 2012 Protective antibody and CD8+ T-cell responses to the Plasmodium falciparum circumsporozoite protein induced by a nanoparticle vaccine. PLoS ONE 7, e48304. (doi:10.1371/journal.pone.0048304)

    2. Kaba SA, Brando C, Guo Q, Mittelholzer C, Raman SK, Tropel D, Aebi U, Burkhard P, Lanar DE. 2009 A non-adjuvanted polypeptide nanoparticle vaccine confers long-lasting protection against rodent malaria. J. Immunol. 183, 7268–7277. (doi:10.4049/jimmunol.0901957)

    3. Pimentel TA, Yan Z, Jeffers SA, Holmes KV, Hodges RS, Burkhard P. 2009 Peptide nanoparticles as novel immunogens: design and analysis of a prototypic severe acute respiratory syndrome vaccine. Chem. Biol. Drug Des. 73, 53–61. (doi:10.1111/j.1747-0285.2008.00746.x)

    4. Babapoor S, Neef T, Mittelholzer C, Girshick T, Garmendia A, Shang H, Khan MI, Burkhard P. 2011 A novel vaccine using nanoparticle platform to present immunogenic M2e against avian influenza infection. Influenza Res. Treat. 2011, 126794–126805. (doi:10.1155/2011/126794)

    5. Wahome N, Pfeiffer T, Ambiel I, Yang Y, Keppler OT, Bosch V, Burkhard P. 2012 Conformation-specific display of 4E10 and 2F5 epitopes on self-assembling protein nanoparticles as a potential HIV vaccine. Chem. Biol. Drug Des. 80, 349–357. (doi:10.1111/j.1747-0285.2012.01423.x)

    6. El-Bissati K, Zhou Y, Dasgupta D, Cobb D, Dubey JP, Burkhard P, Lanar DE, McLeod R. 2014 Effectiveness of a novel immunogenic nanoparticle platform for Toxoplasma peptide vaccine in HLA transgenic mice. Vaccine 32, 3243–3248. (doi:10.1016/j.vaccine.2014.03.092)

    7. López-Sagaseta J, Malito E, Rappuoli R, Bottomley MJ. 2016 Self-assembling protein nanoparticles in the design of vaccines. Comput. Struct. Biotechnol. J. 14, 58–68. (doi:10.1016/j.csbj.2015.11.001)

    8. Powles L, Xiang SD, Selomulya C, Plebanski M. 2015 The use of synthetic carriers in malaria vaccine design. Vaccines 3, 894–929. (doi:10.3390/vaccines3040894)

    9. Sciore A et al. 2016 Flexible, symmetry-directed approach to assembling protein cages. Proc. Natl Acad. Sci. USA 113, 8681–8686. (doi:10.1073/pnas.1606013113)

    10. Fletcher JM et al. 2013 Self-assembling cages from coiled-coil peptide modules. Science 340, 595–599. (doi:10.1126/science.1233936)

    11. King NP, Sheffler W, Sawaya MR, Vollmar BS, Sumida JP, André I, Gonen T, Yeates TO, Baker D. 2012 Computational design of self-assembling protein nanomaterials with atomic level accuracy. Science 336, 1171–1174. (doi:10.1126/science.1219364)

    12. Padilla JE, Colovos C, Yeates TO. 2001 Nanohedra: using symmetry to design self assembling protein cages, layers, crystals, and filaments. Proc. Natl Acad. Sci. USA 98, 2217–2221. (doi:10.1073/pnas.041614998)

    13. Raman SK, Machaidze G, Lustig A, Aebi U, Burkhard P. 2006 Structure-based design of peptides that self-assemble into regular polyhedral nanoparticles. Nanomedicine 2, 95–102. (doi:10.1016/j.nano.2006.04.007)

    14. Yang Y, Ringler P, Müller SA, Burkhard P. 2012 Optimizing the refolding conditions of self-assembling polypeptide nanoparticles that serve as repetitive antigen display systems. J. Struct. Biol. 177, 168–176. (doi:10.1016/j.jsb.2011.11.011)

    15. Indelicato G, Wahome N, Ringler P, Müller SA, Nieh MP, Burkhard P, Twarock R. 2016 Principles governing the self-assembly of coiled-coil protein nanoparticles. Biophys. J. 110, 646–660. (doi:10.1016/j.bpj.2015.10.057)

    16. Raman S, Machaidze G, Lustig A, Olivieri A, Aebi U, Burkhard P. 2009 Design of peptide nanoparticles using simple protein oligomerization domains. Open Nanomed. J. 2, 15–26. (doi:10.2174/1875933500902010015)

    17. Caspar DL, Klug A. 1962 Physical principles in the construction of regular viruses. Cold Spring Harb. Symp. Quant. Biol. 27, 1–24. (doi:10.1101/SQB.1962.027.001.005)

    18. Schwerdtfeger P, Wirz L, Avery J. 2013 Program Fullerene: a software package for constructing and analyzing structures of regular fullerenes, version 4.4. J. Comput. Chem. 34, 1508–1526. (doi:10.1002/jcc.23278)

    19. Fowler PW, Manolopoulos DE. 2006 An atlas of fullerenes. New York, NY: Dover.

    20. Schwerdtfeger P, Wirz L, Avery J. 2015 The topology of fullerenes. WIREs Comput. Mol. Sci. 5, 96–145. (doi:10.1002/wcms.1207)

    21. Goldberg M. 1937 A class of multi-symmetric polyhedra. Tohoku Math. J. 43, 104–108.

    22. Coxeter HSM. 1971 Virus macromolecules and geodesic domes. In A spectrum of mathematics (ed. JC Butcher), pp. 279–303. Oxford, UK: Oxford University Press.

    23. Fowler PW, Cremona JE, Steer JI. 1988 Systematics of bonding in non-icosahedral carbon clusters. Theor. Chim. Acta 73, 1–26. (doi:10.1007/BF00526647)

    24. Fehr T, Bachmann MF, Bucher E, Kalinke U, Di Padova FE, Lang AB, Hengartner H, Zinkernagel RM. 1997 Role of repetitive antigen patterns for induction of antibodies against antibodies. J. Exp. Med. 185, 1785–1792. (doi:10.1084/jem.185.10.1785)

    25. Bachmann MF, Kalinke U, Althage A, Freer G, Burkhart C, Roost H, Aguet M, Hengartner H, Zinkernagel RM. 1997 The role of antibody concentration and avidity in antiviral protection. Science 276, 2024–2027. (doi:10.1126/science.276.5321.2024)

    26. Baschong W, Hasler L, Häner M, Kistler J, Aebi U. 2003 Repetitive versus monomeric antigen presentation: direct visualization of antibody affinity and specificity. J. Struct. Biol. 143, 258–262. (doi:10.1016/j.jsb.2003.08.004)

    27. Bachmann MF, Jennings GT. 2010 Vaccine delivery: a matter of size, geometry, kinetics and molecular patterns. Nat. Rev. Immunol. 10, 787–796. (doi:10.1038/nri2868)

    28. Yassine HM et al. 2015 Hemagglutinin-stem nanoparticles generate heterosubtypic influenza protection. Nat. Med. 21, 1065–1070. (doi:10.1038/nm.3927)

    29. Kanekiyo M et al. 2013 Self-assembling influenza nanoparticle vaccines elicit broadly neutralizing H1N1 antibodies. Nature 499, 102–106. (doi:10.1038/nature12202)

    30. Jardine J et al. 2013 Rational HIV immunogen design to target specific germline B cell receptors. Science 340, 711–716. (doi:10.1126/science.1234150)

    31. Correia BE et al. 2014 Proof of principle for epitope-focused vaccine design. Nature 507, 201–206. (doi:10.1038/nature12966)

    32. Brune KD, Leneghan DB, Brian IJ, Ishizuka AS, Bachmann MF, Draper SJ, Biswas S, Howarth M. 2016 Plug-and-display: decoration of virus-like particles via isopeptide bonds for modular immunization. Sci. Rep. 6, 19234. (doi:10.1038/srep19234)

    33. Dintzis HM, Dintzis RZ, Vogelstein B. 1976 Molecular determinants of immunogenicity: the immunon model of immune response. Proc. Natl Acad. Sci. USA 73, 3671–3675. (doi:10.1073/pnas.73.10.3671)

    34. Strelkov SV, Burkhard P. 2002 Analysis of alpha-helical coiled coils with the program TWISTER reveals a structural mechanism for stutter compensation. J. Struct. Biol. 137, 54–64. (doi:10.1006/jsbi.2002.4454)

    35. Malashkevich VN, Kammerer RA, Efimov VP, Schulthess T, Engel J. 1996 The crystal structure of a five-stranded coiled coil in COMP: a prototype ion channel? Science 274, 761–765. (doi:10.1126/science.274.5288.761)


    Page 8

    Accurate throwing is a skilled motor task in humans that has inspired many studies of motor control [1–6]. However, robust and commonplace observations, such as the trade-off between speed and accuracy [5,7–11] or the choice of overarm versus underarm styles depending on the target, remain unexplained. It is possible that these features are consequences of the underlying biological complexity associated with planning and execution. For example, the intensity of noise in muscles depends on their force output [6,11,12], thereby leading to a speed–accuracy trade-off, and the choice of throwing style may simply be idiosyncratic.

    However, a complementary perspective on the problem is that the physical dynamics of projectile flight map the variability in initial conditions to that of the landing location. This approach to error propagation and amplification in dynamical systems has antecedents going back more than a century [13], but it continues to have implications for prediction and decision-making. The specific example of throwing is particularly interesting because it decouples the internal (neural/cognitive) decision from the dynamics of a projectile, separating planning from control. Once the projectile has been launched, there is no possibility of control, differentiating this task from much better studied tasks such as pointing and tracking [14]. Instead, one has to learn strategies from an iterative process of error estimation and correction from one trial to the next. This has led to studies of error propagation via the throwing actions used in specific sports, such as basketball [4,15,16], darts [3] or pétanque [1], where the analyses treat throwing as a problem of shooting, i.e. the arm plays no role. Here, we complement these by studying optimal strategies in throwing using a simple model of the arm as a finite object, and an analysis of error propagation through the dynamics of an ideal projectile flight. This allows us to address the qualitative selection of overarm versus underarm styles, as well as the quantitative selection of the release angle and speed, and their dependence on the target geometry, location and throwing speed. We then use our results to analyse some experiments in the context of games that involve throwing, and also consider the role of structured noise in the release parameters to determine how this plays out in determining optimal strategies. We finally look at the role of planning uncertainty in characterizing how errors are amplified, and what this implies for a measured approach to learning the optimal strategy for throwing.

    In a minimal setting, we model the arm as a rigid hinged bar that releases a drag-free point projectile at any desired angle ϕ and angular speed ω. We note that the choice of an angle and an angular velocity as the primary variables is deliberate; prescribing an alternative set of variables in a Cartesian system is not completely equivalent. This is because the choice of basic variables plays an important role in determining how errors are amplified via the Jacobian of the transformation from one basis set to another. Our choice is ego-centric, centred on the thrower and using a natural angular set of variables; other choices, and how they affect error propagation, would yield different quantitative results. However, we show that behaviours like the speed–accuracy trade-off are robust to the choice of release parameters.

    The variability and correlations in the release parameters (ϕ, ω) depend on the detailed properties of the neuromuscular system. Here, we assume that the noise in ϕ and ω is uncorrelated; in the context of a linearized analysis, we need no further assumptions about the specific distributions associated with the noise. We assume that the goal of the arm is to throw the projectile into a bin, modelled as a target that presents an up-facing horizontal surface with its centre at a distance ℓ and height h away from the base of the arm. Lengths are expressed in units of the arm length and accelerations in units of Earth’s gravitational acceleration. The dynamics for the position of the projectile x(t), y(t) are:

    ẍ = 0, ÿ = −1, x(0) = cos ϕ, ẋ(0) = −ω sin ϕ and y(0) = sin ϕ, ẏ(0) = ω cos ϕ. (2.1)

    There is a one-parameter family of throwing strategies, found by solving for the three unknowns (t, ϕ, ω) using the two conditions below, which require the projectile launched according to (2.1) to exactly strike the target at (ℓ, h):

    x(t, ϕ, ω) = cos ϕ − tω sin ϕ = ℓ (2.2a)

    and

    y(t, ϕ, ω) = sin ϕ + tω cos ϕ − t²/2 = h. (2.2b)

    To enter the bin, the projectile must move downwards at the time th corresponding to the projectile reaching the height h, i.e. ẏ(th) = ω cos ϕ − th ≤ 0. Using this condition, and solving (2.2b), the time of impact is given by th(ϕ, ω) = ω cos ϕ + √(ω² cos² ϕ − 2(h − sin ϕ)). The horizontal position xh of the projectile as it strikes the plane of the target is then given by

    xh(ϕ, ω) = cos ϕ − ω sin ϕ (ω cos ϕ + √(ω² cos² ϕ − 2(h − sin ϕ))). (2.3)

    In figure 1a, we show two such choices corresponding to the solid red and green trajectories for throws into a horizontal bin with its centre located at ℓ = 1.5, h = −1.5, i.e. in front of and below the shoulder. Naturally, there are some target positions and arm postures which are disallowed, characterized by the requirements that ω² ≥ 2(h − sin ϕ)/cos² ϕ and h ≥ sin ϕ (see the electronic supplementary material for details). Setting xh(ϕ, ω) = ℓ, we obtain the one-parameter continuum of strategies for exactly striking the centre of the target, as given by the speed–angle relationship

    ω0(ϕ) = (cos ϕ − ℓ)/√(2 sin ϕ (1 − ℓ cos ϕ − h sin ϕ)). (2.4)

    Figure 1b shows the solution curve ω0(ϕ) for the target shown in figure 1a. In general, there will be errors in the final position, which we define as δxh = xh − ℓ, the difference between the distance traversed xh and the target distance ℓ when the projectile reaches the height h. To choose from this continuum of possible trajectories, we need a criterion. Noting that there will always be errors in the initial conditions, we suggest that the best strategy from this one-parameter family of solutions is the one that is maximally tolerant of these initial errors.
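
    Equations (2.2)–(2.4) are straightforward to evaluate numerically. The following sketch (our Python illustration, in the paper's non-dimensional units with arm length 1 and g = 1, and with the target of figure 1a) implements xh(ϕ, ω) and ω0(ϕ) and verifies that a throw on the solution curve strikes the target centre:

```python
# Sketch (ours): landing abscissa x_h (eq. 2.3) and solution curve
# omega_0 (eq. 2.4) for a drag-free projectile released by a unit-length arm.

import math

L_TARGET, H_TARGET = 1.5, -1.5  # target centre (l, h), as in figure 1a

def x_h(phi, omega):
    """Horizontal position when the projectile reaches height h (eq. 2.3)."""
    disc = omega**2 * math.cos(phi)**2 - 2 * (H_TARGET - math.sin(phi))
    t_h = omega * math.cos(phi) + math.sqrt(disc)  # time of impact
    return math.cos(phi) - omega * math.sin(phi) * t_h

def omega_0(phi):
    """Release speed that strikes the target centre exactly (eq. 2.4)."""
    denom = 2 * math.sin(phi) * (1 - L_TARGET * math.cos(phi)
                                 - H_TARGET * math.sin(phi))
    return (math.cos(phi) - L_TARGET) / math.sqrt(denom)

phi = 2.0                       # an example release angle (radians)
print(x_h(phi, omega_0(phi)))   # ~1.5: the throw hits the target centre
```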

    Figure 1. Error propagation in throwing depends on the trajectory of the projectile. Underarm: ω0(ϕ) > 0; overarm: ω0(ϕ) < 0. (a) For a given target, there are generally multiple ways to strike it exactly; the solid red and green trajectories are two examples using an overarm and an underarm style, respectively. The lightly shaded bands show how uniformly distributed errors in position and velocity propagate; the overarm throw is more accurate. (b) The curve of launch parameters (ϕ, ω0(ϕ)) given by equation (2.4) that exactly strike the target. (c) Deviations away from this curve lead to an error (δxh)² as a function of ϕ and ω. Error amplification is quantified by the maximal curvature of the valley of this surface.


    To quantify the amplification of small launch errors, we linearize xh(ϕ, ω) in the neighbourhood of the curve (ϕ, ω0(ϕ)) to obtain the relationship between the ‘input error vector’ ϵ = (δϕ, δω) and the ‘output/target error’ δxh, given by δxh ≈ Jerr(ϕ)·ϵ, where

    Jerr(ϕ) = (∂xh/∂ϕ, ∂xh/∂ω)|(ϕ,ω0(ϕ)). (3.1)

    In fact, as there is a one-dimensional curve of solutions (ϕ, ω0(ϕ)) where δxh = 0, Jerr will be rank deficient, i.e. it has a zero singular value and an associated non-trivial null space, namely the tangent to the curve (ϕ, ω0(ϕ)), while the other singular value is λ(ϕ). In figure 1a, we show that the amplification of errors in the release angle δϕ and speed δω, exemplified by the lightly shaded bands, depends upon the trajectory of the projectile; the overarm throw is clearly the better choice here. In figure 1c, we show how the squared error δxh²(ϕ, ω) varies as a function of the uncertainty in the release parameters. The minimum (valley) is simply the solution curve ω0(ϕ), and the maximum curvature of the surface orthogonal to this valley is a measure of how δxh² grows for small launch errors δϕ and δω. It is easy to see that the curvature of the error surface is 2λ², where λ is the non-zero singular value of the Jacobian that maps the initial conditions to the final state. Errors in ϕ and ω that are tangent to the solution curve ω0(ϕ) cancel each other by virtue of belonging to the null space of the error propagation map Jerr; otherwise they are amplified in proportion to λ. Thus, the reciprocal of λ is a natural measure of accuracy, henceforth denoted by p, i.e. p = 1/λ.
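
    Continuing the previous sketch (it assumes the x_h, omega_0 and target constants defined there), the accuracy p = 1/λ can be estimated by finite differences along the solution curve:

```python
# Sketch (ours): estimate lambda, the non-zero singular value of the 1x2
# Jacobian J_err = (dx_h/dphi, dx_h/domega) on the solution curve, by
# central differences, and report the accuracy p = 1/lambda.
# Assumes x_h and omega_0 from the previous sketch are in scope.

import math

def accuracy(phi, eps=1e-6):
    w = omega_0(phi)
    dx_dphi = (x_h(phi + eps, w) - x_h(phi - eps, w)) / (2 * eps)
    dx_dw = (x_h(phi, w + eps) - x_h(phi, w - eps)) / (2 * eps)
    lam = math.hypot(dx_dphi, dx_dw)  # norm of a 1x2 matrix = singular value
    return 1 / lam

for phi in (1.8, 2.0, 2.2):  # compare a few release angles for this target
    print(phi, accuracy(phi))
```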

    Accuracy p is parametrized either using the launch speed ω, by considering the neighbourhood of the function ω0⁻¹(ω) (the inverse of (2.4)), or using the launch angle ϕ, by considering the neighbourhood of the solution curve ω0(ϕ). In figure 2a, we show four curves that quantify accuracy for two given targets; for each, there are two possible overarm throws (shallow or high), and similarly for underarm throws. For each of the example targets, faster throws lead to a sharp decay in accuracy, and the overarm throw is as good as or better than the underarm throw at high speeds. In figure 2b, we show polar plots of the accuracy (radial distance) as a function of launch angle (polar angle). For the target below the shoulder with h = −1.5, the overarm throw is more accurate than the underarm throw. The converse is true for the second example of a target above the shoulder with h = 1.5. However, for the second target, the superiority of the underarm throw is extremely sensitive to any uncertainty in planning because of its very sharply peaked accuracy curve. Planning uncertainties would manifest as an incorrect choice of optimal release parameters, and the underarm strategy would be sensitive to these planning errors.


    Figure 2. Error amplification for throwing depends on the target location and planning errors. (a) For two different targets, one below and another above the shoulder, a comparison between all possible overarm (solid red) and underarm (dashed green) throws. Accuracy is quantified by the inverse of the non-zero singular value, i.e. p = 1/λ, of the map Jerr given by equation (3.1), as a function of the release velocity ω. (b) Polar plots of accuracy p(ϕ) = 1/λ(ϕ) as a function of arm angle at release ϕ for the same two targets. We see that even where p is a maximum, it can fall off quickly, so that such a strategy is susceptible to small planning errors, i.e. inaccuracies in the internal model of the dynamics.


    Speed–accuracy trade-offs in biological systems are usually explained as the result of structured noise in the system dynamics [11,17] in the form of corrective submovements, velocity-dependent noise, activation-dependent noise or more generally ‘signal-dependent’ and structured covariance of input noise at the level of muscles [10,12,18]. Given that our physical picture for throwing introduces noise in a simple setting, we ask what the consequences are for speed–accuracy trade-offs.

    In figure 3, we show that slower throws are typically more accurate than faster ones—the classic trade-off between speed and accuracy that is observed in multiple contexts of human motor behaviour [7–10,12,18,19]. The most accurate throw is typically slightly faster than the minimum speed $\omega_{\min}(\ell,h)$ needed to reach a given target. At higher speeds, the shallow overarm throw is most accurate, particularly for targets at or below the arm pivot (equivalent to the shoulder). Therefore, the physics of projectile flight dictates that throwing slowly generally maximizes accuracy, and if it becomes necessary to throw fast, an overarm style is the more accurate one. The speed–accuracy trade-off is also seen for other target geometries, such as a vertically oriented target, as shown in the electronic supplementary material.


    Figure 3. Speed–accuracy trade-off for various locations of a horizontal target. The location of each set of curves corresponds to the target location. The target locations are specified by the angle to the target $\theta_{\rm target}$ and the distance $\sqrt{\ell^2+h^2}$. For each target location and choice of throwing speed (absolute value), there are generally four distinct release angles: either overarm or underarm, and for each style, either a shallow (closest to a straight-line path) or a high throw. Every curve shows a decrease in accuracy for higher speeds. For most target locations, our model predicts that the most accurate throw is at speeds slightly higher than $\omega_{\min}$, the smallest possible speed to reach the target. Of all curves at each target location, a shallow overarm throw is typically more accurate than the other three styles at all speeds. All calculations restricted $\omega_0\in[\omega_{\min},3.54]$; for a 1 m arm length, this corresponds to a maximum launch speed of ≈11.1 m s−1 (≈40 km h−1).


    To understand this, we note that irrespective of the parameters being controlled by the throwing arm, the strategy for striking the centre of the target is specified by the curve $\omega_0(\phi)$, which has the same qualitative shape whether the target lies below, level with, or above the arm, as seen from figure 4a. The thickened curves show $\omega_0$, and the inset shows the emerging speed–accuracy trade-off, for a target that is three arm-lengths away, at an angle $\theta_{\rm target}=-\pi/3$ relative to the horizontal. Accuracy is given by

    $$p(\omega)=\frac{1}{\lambda(\omega)}\tag{4.1}$$

    and

    $$\lambda^2=\lambda_\phi^2+\lambda_\omega^2,\tag{4.2}$$

    where the squared overall error $\lambda^2$ is the sum of the squared errors $\lambda_\phi^2$ and $\lambda_\omega^2$, owing to variations in release angle and in release speed, respectively. Consider $\omega_0(\phi)$ for the overarm throw, as shown in figure 4b. At the slowest possible speed $\omega_{\min}$, the landing location is insensitive to small fluctuations $\delta\phi$ in the release angle because the curve is parallel to the $\phi$ axis, and $\lambda_\phi^2(\omega_{\min})=0$. At the other extreme of $\omega\to\infty$, speed fluctuations $\delta\omega$ are parallel to the $\omega$ axis, and therefore $\lambda_\omega^2=0$. There are two limiting throwing angles where $\omega\to\infty$, such that the flight time $t_h\to\infty$ for the release corresponding to the high curved throw, and $t_h\to0$ for the straight shot. With infinite flight time, $\lambda_\phi^2\to\infty$, and therefore $p\to0$, as seen for the dotted lines in the inset of figure 4a. For the straight shot towards the target (inset of figure 4c), $\lambda_\phi^2=0$ at the minimum speed and $\lambda_\phi^2\to\lambda_\infty^2$ as $\omega\to\infty$. On the other hand, $\lambda_\omega^2=\lambda_{\omega_{\min}}^2$ at the minimum speed and $\lambda_\omega^2\to0$ as $\omega\to\infty$. The sum of these two has a local minimum, implying that there is a certain throwing speed that maximizes accuracy, and faster throws are always less accurate. The presence of a minimum for any linear weighted sum $\alpha\lambda_\phi^2+\beta\lambda_\omega^2$ depends only on the rate of decay of $\lambda_\omega^2$ being higher than the rate of growth of $\lambda_\phi^2$; in particular, it has no dependence on the weights $\alpha$ and $\beta$ so long as $\alpha,\beta\neq0$. In other words, the relative noise level in the angle and speed has no impact on the speed–accuracy trade-off, because $\lambda_\omega^2$ decays faster than $\lambda_\phi^2$ grows (figure 4d), as given by (see the electronic supplementary material for the derivation)

    $$1-\frac{\lambda_\phi^2(\omega)}{\lambda_\phi^2(\infty)}\sim\omega^{-2}\tag{4.3}$$

    and

    $$\lambda_\omega^2(\omega)\sim\omega^{-6}.\tag{4.4}$$

    The exponents are independent of the target’s location, and only depend on the nature of projectile flight under uniform gravity, as captured by the curve ω0(ϕ).

    Figure 4. The speed–accuracy trade-off emerges independent of the target location and of the relative noise level in release angle versus speed. (a) Speed versus angle plots for targets at various positions around the arm, three arm-lengths away from the pivot. The remaining panels demonstrate the principles governing the speed–accuracy trade-off using the target located at an angle of $-\pi/3$ relative to the horizontal. (b) At the slowest possible throw, the curve $\omega_0(\phi)$ is parallel to the $\phi$ axis, and the landing location is therefore insensitive to small variations in the release angle. At the other extreme, when the launch speed is infinitely large, the throwing error is insensitive to variations in speed. Of the two limiting throws with infinite speed, the one with an infinite flight time will infinitely amplify even the smallest angle error. (c) The total squared error $\lambda^2=\lambda_\phi^2+\lambda_\omega^2$ is the result of a competition between a power-law decay in $\lambda_\omega^2$ and a power-law rise in $\lambda_\phi^2$. The competition always leads to a local minimum in the total error. (d) The decay and rise have different exponents, with the decay of $\lambda_\omega^2$ always being faster than the rise in $\lambda_\phi^2$. Therefore, the local minimum in the total error exists for any linear weighted sum of the individual errors $\lambda_\phi^2$ and $\lambda_\omega^2$.


    Our minimal model, based on the error amplification properties of parabolic projectile flight, exhibits the experimentally well-known trade-off between speed and accuracy in throwing [7]. Furthermore, it naturally justifies the qualitative observation that the most precise throw is slightly faster than the minimum throwing speed for hitting the target [5], independent of target geometry. This result emphasizes the importance of the physical task in characterizing speed–accuracy trade-offs, which are not likely to be just intrinsic properties of the motor system. Consistent with this, virtual reality throwing experiments in a non-uniform gravitational field by Sternad et al. [11] show that speed and accuracy do not always trade off against each other, and whether they do depends on the task.

    Our theory suggests a plausible mechanism for optimal strategies, and makes some simple predictions. We now follow some of its implications in the context of games that involve throwing, characterize how structured noise in the release parameters plays out in determining optimal strategies, and conclude with some thoughts on learning the optimal strategy for throwing.

    Dart throwing is a game that requires the accurate release of projectiles, with a simple metric of performance that is easily quantified. Data on dart throwers from [3] show that the dart is released at a speed of 5.8–6.7 m s−1, about 4–25 ms before the peak of a circular motion of the hand with a radius of 0.5–0.7 m. In the context of our model, we choose either the forearm or the whole arm as the natural length scale (see the electronic supplementary material, Scaling of experimental data). For a vertical target at the prescribed distance of 2.37 m in front of the thrower and 1.73 m above the ground, we calculate the optimal strategy $\phi_{\rm optim}$ and $\omega_{\rm optim}=\omega_0(\phi_{\rm optim})$ that maximizes accuracy $p(\phi)$. The optimal dart throwing strategy is an overarm throw with an optimal release angle of 17–37° before the arm becomes vertical, and a corresponding optimal speed of 5.1–5.5 m s−1. At this speed and release angle, the dart would be released 44–35 ms before the hand reaches the zenith. The best overarm throw is 7–20% more accurate than the best underarm throw, as found from the ratio of the accuracies $p(\phi_{\rm optim})$. The overarm throwing strategy with a larger radius of curvature (0.8 m) and higher speed (5.5 m s−1) is the most accurate of all, consistent with observations. Our predictions are only weakly dependent on the choice of the length scale, but are strongly dependent on the target geometry (see the electronic supplementary material for both). Similar calculations for basketball free throws, another sport with an emphasis on accuracy, also recover strategies that are consistent with observations (see the electronic supplementary material).

    We have so far assumed no covariance structure for the noise in the release angle $\phi$ and speed $\omega$. In the linearized analysis, this is equivalent to having a uniform (circular) distribution of the errors in these variables, with a variance (radius) $c$ that is amplified through the projectile dynamics. Covariance structure in the input noise is manifest as an elliptic distribution of errors, say with semi-major axis $a$ and semi-minor axis $b$. In order to compare two cases with equal noise, we constrain the areas of the circle and the ellipse to be the same, i.e. $ab=c^2$. These distributions of initial conditions are propagated by the projectile dynamics to produce an error $e_{\rm cir}$ or $e_{\rm ell}$, respectively. For small input noise, the radius of the input noise is amplified linearly by the singular value $\lambda$ of the linearized projectile map $J_{\rm err}$ (figure 1c). Furthermore, using the fact that the smallest possible input error for the elliptic initial distribution is $b$ (by definition of the semi-minor axis) and that $c^2=ab$, we obtain $e_{\rm cir}^2=\lambda^2c^2=\lambda^2ab$ and $e_{\rm ell}^2\geq\lambda^2b^2$. Therefore,

    $$\frac{e_{\rm ell}}{e_{\rm cir}}\geq\sqrt{\frac{b}{a}}.\tag{5.1}$$

    Even for strong covariance, i.e. $b/a\sim O(\epsilon)\ll1$, the reduction in noise amplification is at best the square root of the eccentricity of the covariance. Furthermore, to achieve this limit of noise reduction through covariation, the two release parameters have to covary exactly and compensate for each other's amplification by the projectile dynamics, i.e. the major axis of the covariance ellipse has to align exactly with the null space of $J_{\rm err}$. If the major axis were misaligned by an angle $\theta$, then $e_{\rm ell}/e_{\rm cir}=\sqrt{b/a}\,\cos\theta+\sqrt{a/b}\,\sin\theta$. With strong covariance $b/a\sim O(\epsilon)$ and a small misalignment of the covariance ellipse $\theta\sim O(\gamma)$, $\gamma\ll1$, we have $e_{\rm ell}/e_{\rm cir}=\epsilon+\gamma$. Therefore, the effect of a strong covariance is diminished, and the effect of a small misalignment is amplified. Structured noise, such as is typical of human motor control [6,11,18,20], undoubtedly reduces the impact of noise on performance. However, we have shown here that for throwing, and in fact for any motor task where error amplification depends smoothly on the input parameters, using noise covariance to mitigate errors is a fragile strategy.
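    As a numerical check on the bound (5.1), the sketch below propagates equal-area circular and elliptic Gaussian input noise through a fixed linearized map; the 1×2 map J and the axis lengths a, b are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rms_output_error(J, cov, n=200_000):
    """RMS target error when zero-mean Gaussian input noise with covariance
    cov is pushed through the 1x2 linearized error map J."""
    samples = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return np.sqrt(np.mean((samples @ J.ravel()) ** 2))

J = np.array([[3.0, 4.0]])               # illustrative map; lambda = |J| = 5
a, b = 1.0, 0.04                         # ellipse semi-axes, strong covariance b/a << 1
c = np.sqrt(a * b)                       # equal-area circle radius, c^2 = ab
u = np.array([-4.0, 3.0]) / 5.0          # unit vector spanning the null space of J
w = np.array([3.0, 4.0]) / 5.0           # unit vector along the row of J
cov_ell = a**2 * np.outer(u, u) + b**2 * np.outer(w, w)  # major axis in the null space
cov_cir = c**2 * np.eye(2)

ratio = rms_output_error(J, cov_ell) / rms_output_error(J, cov_cir)
print(ratio, np.sqrt(b / a))             # with perfect alignment, ratio sits at the bound
```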

    We conclude with a brief discussion of the role of uncertainty in planning before throwing, i.e. potential inaccuracies in the internal model of the projectile's dynamics, captured by a ‘planning uncertainty distribution’ $\tau(\phi)$. This distribution models the uncertainty in the optimal strategy, which in turn is an outcome of uncertainty in the internal model of the projectile's dynamics. The effective accuracy, following the definition of conditional probabilities, is given by

    $$E_{\rm over}=\int_{\omega_0(\phi)<0}p(\phi\,|\,\tau)\,\tau(\phi)\,{\rm d}\phi\quad{\rm and}\quad E_{\rm under}=\int_{\omega_0(\phi)\geq0}p(\phi\,|\,\tau)\,\tau(\phi)\,{\rm d}\phi.\tag{5.2}$$

    When the internal model is uncertain, we predict the optimal fractions of overarm and underarm throws to be $f_{\rm over}=E_{\rm over}/(E_{\rm over}+E_{\rm under})$ and $f_{\rm under}=1-f_{\rm over}$. This follows because $E_{(\cdot)}$ is itself a random variable, owing to uncertainty in the underlying model of the dynamics of the projectile, so that a minimal assumption is that $f_{(\cdot)}$ be proportional to $E_{(\cdot)}$.

    One natural limit of the planning probability distribution $\tau(\phi)$ corresponds to perfect planning, for an expert with zero uncertainty in the optimal overarm and underarm throws, i.e. $\tau(\phi)=\delta(\phi-\phi^{\rm optim}_{\rm over})+\delta(\phi-\phi^{\rm optim}_{\rm under})$, so that $E_{(\cdot)}=\max p_{(\cdot)}(\phi)$. The other limit is uniform planning, with large uncertainty, for a novice, i.e. $\tau(\phi)=1$, so that $E_{(\cdot)}=\int_{(\cdot)}p(\phi)\,{\rm d}\phi$.
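    A minimal numerical sketch of this prediction, assuming $p(\phi)$ and $\tau(\phi)$ have already been computed on a grid of release angles; the function name and the boolean mask encoding of overarm angles are our own conveniences, not the paper's.

```python
import numpy as np

def throw_fractions(phi, p, tau, overarm_mask):
    """Predicted fractions of overarm and underarm throws: E_over and E_under
    follow equation (5.2) by numerical quadrature over a grid of release
    angles phi, and the fractions are taken proportional to them."""
    w = p * tau                                          # effective accuracy density
    E_over = np.trapz(np.where(overarm_mask, w, 0.0), phi)
    E_under = np.trapz(np.where(~overarm_mask, w, 0.0), phi)
    f_over = E_over / (E_over + E_under)
    return f_over, 1.0 - f_over
```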

    In figure 5a, we see that with no planning errors, an underarm throw is preferred for targets above the shoulder, but an overarm throw is preferred for targets below the shoulder. In figure 5b, we see that with large planning errors, an overarm throw is preferred for most target locations. Comparing this with figure 2a, we see that for a target below the shoulder, an overarm strategy is better, and also more forgiving of planning errors in $\phi$. Comparing with figure 2b, this is consistent with the fact that accuracy $p$ is strongly peaked for underarm throws towards a target above the shoulder but falls off rapidly, so that the underarm throw is optimal under small planning errors but not robust to large ones. These predictions are also consistent with previous observations [2] that the preference for overarm versus underarm throwing depends on the distance to the target (see the electronic supplementary material).


    Figure 5. Predicted and measured fraction of overarm throws for a planar, horizontal target (like a bin). Colours correspond to the expected fraction of overarm throws as a function of the height and distance of the target from the arm (of length unity). With zero planning errors (a), an underarm throw is strongly preferred for targets just outside arm length above the shoulder, while an overarm style is preferred for targets below the shoulder. By contrast, when planning errors are large (b), there is a strong preference for the overarm style almost independent of target location.


    The ability to throw fast and accurately is quintessentially human, and a seemingly complex task. Here, we have focused on the simplest physical problem of how errors in the release parameters are amplified by the parabolic trajectory of a thrown projectile to determine optimal strategies for throwing. Although throwing is a complicated motor task, the predictions of our model for overarm versus underarm throwing styles are consistent with extant experimental data that show a dependence of style on the target location as well as on planning uncertainty. Despite the absence of neural or physiological elements, our minimal model is consistent with a range of independent experiments on throwing: the speed–accuracy trade-off that is commonly observed in throwing [7], the optimal strategy (launch angle and speed) for throwing darts [3], the preference of overarm versus underarm throwing based on target location [2] and even the qualitative claim that the most precise throw is slightly faster than the minimum throwing speed for hitting the target [5]. Our work suggests that strategy and trade-offs are intimately related to the motor tasks that involve interactions with the environment, and are not just intrinsic properties of the neuromotor system. This should hardly be a surprise, since the system evolved and developed in a physical environment, but is a point worth emphasizing since it is all too often forgotten. We must look for the physical origins of the speed–accuracy trade-off that are central to the observed trade-offs in humans [5,7,9]. Given that throwing might have played a substantial role in our evolutionary past, some of our results on the trade-off between speed and precision, especially its dependence on the throwing style, may come to bear on the topic of human evolution.

    Accurate throwing could serve as a testbed for understanding motor learning without the added difficulties of continuous feedback control present in tasks such as arm-pointing [14]. Indeed, our study naturally points to iterative approaches for learning that first execute a plan, observe errors or performance of the output, and use that to build an internal model. In a Bayesian framework, the planning distribution τ(ϕ) is the prior, p(ϕ | τ)τ(ϕ) is the observed posterior and the motor learning algorithm corresponds to the inference of the true model p(ϕ), i.e. an experiential understanding of Newtonian mechanics from repeated observations. Because the thrower’s performance depends on both p and τ, the thrower can employ τ as a probe to learn the dynamics of the task, i.e. p. Whether this is how we actually do learn about the physical world remains a question for the future.

    All detailed derivations are included in the electronic supplementary material. No other data were generated in this research.

    L.M. conceived of the study; M.V. did the calculations and both authors formulated the problem and wrote the paper.

    We have no competing interests.

    Funding support was provided by the Wellcome Trust/DBT India Alliance (grant no. 500158/Z/09/Z), the Human Frontier Science Program, the Wyss Institute for Bioinspired Engineering and the MacArthur Foundation.

    Footnotes

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3738164.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Dupuy MA, Motte D, Ripoll H. 2000 The regulation of release parameters in underarm precision throwing. J. Sports Sci. 18, 375–382. (doi:10.1080/02640410050074304)

    2. Westergaard GC, Liv C, Haynie MK, Suomi SJ. 2000 A comparative study of aimed throwing by monkeys and humans. Neuropsychologia 38, 1511–1517. (doi:10.1016/S0028-3932(00)00056-7)

    3. Smeets JB, Frens MA, Brenner E. 2002 Throwing darts: timing is not the limiting factor. Exp. Brain Res. 144, 268–274. (doi:10.1007/s00221-002-1072-2)

    4. Gablonsky JM, Lang A. 2005 Modeling basketball free throws. SIAM Rev. 47, 775–798. (doi:10.1137/S0036144598339555)

    5. Freeston J, Ferdinands R, Rooney K. 2007 Throwing velocity and accuracy in elite and sub-elite cricket players: a descriptive study. Eur. J. Sport Sci. 7, 231–237. (doi:10.1080/17461390701733793)

    6. Cohen RG, Sternad D. 2009 Variability in motor learning: relocating, channeling and reducing noise. Exp. Brain Res. 193, 69–83. (doi:10.1007/s00221-008-1596-1)

    7. Kerr BA, Langolf GD. 1977 Speed of aiming movements. Q. J. Exp. Psychol. 29, 475–481. (doi:10.1080/14640747708400623)

    8. Meyer DE, Keith-Smith JE, Kornblum S, Abrams RA, Wright CE. 1990 Speed–accuracy tradeoffs in aimed movements: toward a theory of rapid voluntary action. In Attention and Performance XIII: motor representation and control (ed. M Jeannerod), pp. 173–226. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.

    9. Etnyre BR. 1998 Accuracy characteristics of throwing as a result of maximum force effort. Percept. Mot. Skills 86, 1211–1217. (doi:10.2466/pms.1998.86.3c.1211)

    10. Harris CM, Wolpert DM. 2006 The main sequence of saccades optimizes speed–accuracy trade-off. Biol. Cybern. 95, 21–29. (doi:10.1007/s00422-006-0064-x)

    11. Sternad D, Abe MO, Hu X, Müller H. 2011 Neuromotor noise, error tolerance and velocity-dependent costs in skilled performance. PLoS Comput. Biol. 7, e1002159. (doi:10.1371/journal.pcbi.1002159)

    12. Harris CM, Wolpert DM. 1998 Signal-dependent noise determines motor planning. Nature 394, 780–784. (doi:10.1038/29528)

    13. Poincaré H. 1912 Calcul des probabilités. Paris, France: Gauthier-Villars. (http://www.archive.org/details/calculdeprobabil00poinrich)

    14. Shadmehr R, Wise SP. 2005 The computational neurobiology of reaching and pointing: a foundation for motor learning. Cambridge, MA: The MIT Press.

    15. Brancazio PJ. 1981 Physics of basketball. Am. J. Phys. 49, 356–365. (doi:10.1119/1.12511)

    16. Okubo H, Hubbard M. 2006 Dynamics of the basketball shot with application to the free throw. J. Sports Sci. 24, 1303–1314. (doi:10.1080/02640410500520401)

    17. Plamondon R, Alimi AM. 1997 Speed/accuracy trade-offs in target-directed movements. Behav. Brain Sci. 20, 279–303; discussion 303–349.

    18. Todorov E. 2004 Optimality principles in sensorimotor control. Nat. Neurosci. 7, 907–915. (doi:10.1038/nn1309)

    19. Fitts PM. 1954 The information capacity of the human motor system in controlling the amplitude of movement. J. Exp. Psychol. 47, 381–391. (doi:10.1037/h0055392)

    20. Scholz JP, Schöner G. 1999 The uncontrolled manifold concept: identifying control variables for a functional task. Exp. Brain Res. 126, 289–306. (doi:10.1007/s002210050738)


    Page 9

    There has been an explosion in the number of mathematical and computer models proposed to describe a wide range of neurological phenomena [1]. However, the nervous system is composed of approximately $10^{11}$ neurons [2], a network far too large to model directly. To overcome this, and to make use of high- (though limited) resolution data, some models have considered smaller graphs, where each node represents a region of the brain. For example, McIntosh et al. [3] use as few as four nodes and Grindrod et al. [4] use $10^5$, where each node is a voxel from an fMRI scan. Other observational methods use multivariate time series, such as readings from EEGs, MEGs and fMRIs [5]. These readings essentially represent the brain as communication among approximately 64 regions. Region-to-region communication is tracked by measuring activation level, firing rate or membrane potential across time. However, the signals are not at the neuron level; they are noisy and involve conduction delays. Methods to analyse these time series include the phase lag index, the phase-locking value and partial directed coherence [6–8]. Complex-network analyses of multivariate time series have grown rapidly in recent years and have been successfully applied to challenging problems in many research fields, brain networks among them. In particular, Gao et al. proposed a multiscale complex network [9], a multi-frequency complex network [10], a multiscale limited penetrable horizontal visibility graph [11] (an alternative to the visibility graph [12]) and a multivariate weighted complex network [13] to analyse multivariate time series. These methods have proved to be powerful analytic frameworks for characterizing complicated dynamic behaviours from observable time series. Nonetheless, these observational approaches essentially provide a top-down analysis. That is, they inform us where energy is being used, but provide no details about what processes are actually taking place.

    So although analysing structural networks may help us to understand the fundamental architecture of inter-regional connections, we must also consider functional networks directly to elucidate how this architecture supports neurophysiological dynamics [14]. Inspired by the advent of micro-electrode recording techniques, which made it possible to record electrical events from single neurons, many models focus on systems composed of two or three neurons ([1] and references therein). Let us elaborate on models of this type.

    For models where only two to three nodes are considered, a common starting point is the relationship between the neuron’s input and output. The times of neuron firings can be presented as a neural spike train (figure 1). Considerable experimental and theoretical work has been devoted to measuring and modelling the statistical properties of neural spike trains in the hope of deciphering the neural code [15,16]. Commonly, the frequency or inter-spike interval is analysed; for example, inter-spike intervals for many neurons can be described by density functions, which may be generated by chaotic deterministic processes [17]. There is experimental evidence to suggest that even very random looking neural spike trains may represent deterministic chaos [18].


    Figure 1. An example of three neural spike trains.


    Recent developments in the quantitative analysis of complex networks, based largely on graph theory, have been rapidly translated to studies of brain network organization [14]. Graph theory offers new ways to quantitatively characterize anatomical patterns. Structural neural networks can be described as directed graphs that are composed of nodes denoting neurons that are linked by edges representing physical connections (synapses or axonal projections) [14]. The notion that structure can predict function also informs one of the chief goals of connectomics, which is to furnish network models that can bridge brain structure and function [19,20].

    Our work is in the spirit of the theoretical models of two or three nodes, but on a considerably larger scale. We consider the neuronal network on the micro-scale (neurons), the meso-scale (strongly connected subgraphs, SCGs) and the macro-scale (SCGs connected loosely as a whole, figure 2). Our contribution is to quantify how behaviour on the micro-scale impacts the meso-scale. We simulate the micro- and meso-scale behaviour by considering the individual excitable and refractory neuron firing times, via neural firing trains, but on a scale much larger than two or three nodes. By computationally simulating firing times, we record many neural firing trains and represent this time series in an n-dimensional space. We find that all else being equal, the functionality of SCGs increases sublinearly with the size of the SCG. This has profound implications with regard to explaining how brains evolved to ensure high functionality.


    Figure 2. A directed network containing many SCGs together with its block upper triangular adjacency matrix. We model the neural network by examining the activity within the SCGs, where we experiment with varying sizes of SCGs.


    The structure of our network is further described in §2. The dynamics of the neurons are described in §3. In §4.1, the dynamics within an SCG are modelled to generate a time series for the firings of each vertex within the SCG. In §4.2, we use the inter-spike intervals for a single neuron to construct a sequence of vectors according to the standard delay-coordinate embedding method [21]. Assembling these vectors as a matrix, we identify an upper bound for the embedding dimension μ of the time series. After establishing the neuron dynamics and the method for finding the embedding dimension of an SCG, in §5 we apply this method to many randomly generated SCGs with a varying number of vertices n. In §6, we consider the case where an SCG receives regular input with period p. Under this circumstance, we show that the activity within the SCG sometimes does not settle towards a correspondingly periodic state. Lastly, in §7, we discuss our findings: the embedding dimension increases sublinearly with the size of the SCG, implying that the neural network is more efficient when connected as demonstrated in figure 2.

    Consider a very large population of interacting neurons, from some region of a human brain, that together form a directed graph of one-way synaptic connections, representing the fact that neuron A's firing is an influence upon, or input to, an immediate downstream neuron B's firing state. At the microscopic level of single connections, the dynamic response of B to A's firing behaviour must involve some measurable delay: the time taken for electrical excitation waves to travel from A's soma out along its axon, across a synapse, and then in along B's dendrites to reach B's soma, plus the time taken for B's soma to respond electrochemically. These time delays may well be the key to building the rich behavioural capacity of such networks in exhibiting distinctive, alternative modes of processing (as has been observed in complementary approaches using delay-continuum field theories [22]).

    Now consider the macro-scale wiring structure of the directed graph corresponding to a very much larger collection of millions of such neurons. Within that network each neuron is represented by a unique vertex.

    We will consider various directed graphs, denoted by G, on n vertices, $V=\{v_i\,|\,i=1,\ldots,n\}$, having at most one edge between each pair of vertices and no loops (edges from a vertex to itself). We denote the edge from $v_i$ to $v_j$ by the ordered pair $(v_i,v_j)$. If there is a connected walk from $v_i$ to $v_j$ then we will say that $v_j$ is downstream from $v_i$ and that $v_i$ is upstream of $v_j$. A strongly connected directed graph G is one where any vertex is connected to any other vertex by at least one walk (so that every vertex is both upstream and downstream of every other vertex). For any such graph G, we let A denote the $n\times n$ adjacency matrix, where $A_{ij}=1$ if and only if the edge $(v_i,v_j)$ is present, and zero otherwise. If G is strongly connected then A is irreducible; in that case we must have $\sum_{k=1}^{n-1}A^k>0$. Alternatively, one may check for strong connectivity with two applications of the depth-first search algorithm (DFS) [23]: pick an arbitrary vertex, v say, run DFS once starting at v and establish that all other vertices can be reached downstream from v. Then reverse all the edge directions and rerun DFS starting at v, thus checking that all other vertices are upstream of v.
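    For concreteness, here is a minimal sketch of that two-pass test for an adjacency-list representation (an iterative traversal stands in for recursive DFS, since only the reachable set matters here); the function and variable names are ours.

```python
def is_strongly_connected(adj):
    """Two-pass reachability test: every vertex must be reachable downstream
    from vertex 0, and vertex 0 must be downstream of every vertex (checked
    by searching the graph with all edges reversed). adj[v] lists the
    vertices immediately downstream of v."""
    n = len(adj)

    def reaches_all(graph, start=0):
        seen, stack = {start}, [start]
        while stack:
            for w in graph[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n

    reversed_adj = [[] for _ in range(n)]
    for v, outs in enumerate(adj):
        for w in outs:
            reversed_adj[w].append(v)
    return reaches_all(adj) and reaches_all(reversed_adj)
```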

    Within a very large directed graph, we may be able to identify (via network analysis) many such strongly connected, or ‘irreducible’, sub-graphs, here called strongly connected (sub-)graphs, SCGs for short (figure 2). They exist at an intermediate, mesoscopic scale, in between the pairwise microscopic interactions and the macroscopic whole network. We assume that SCGs are maximal in the sense that they are not a proper subset of any larger SCG. Then in any connected directed graph, a maximal SCG defines a three-way partition of the remaining vertices: those that are upstream of the SCG (there is a directed path from such vertices to at least one, and hence every, vertex in the SCG); those that are downstream (there is a directed path from at least one, and hence every, vertex in the SCG to such vertices); and other vertices (e.g. downstream of upstream vertices, or upstream of downstream vertices) with no directed paths to or from the SCG. Therefore, an SCG receives input as a perturbation from an upstream SCG, and produces output by perturbing a downstream SCG. If we try to reduce an SCG by removing one or more of its vertices, by definition such vertices will be both upstream and downstream of the SCG, and we will no longer have a well-defined partition. Note that these definitions and ideas are related to the Markov blanket concept, which separates active-internal states and external states, explored for networks of neurons in Friston's work [24].

    We address a simple question: what work, in terms of processing inputs to produce outputs, might such an SCG, made of coupled firing (dynamical) entities, actually do? To begin, we consider the dynamics of information propagation defined on SCGs.

    In this paper, we discuss how the dynamical system within SCGs typically has multiple nonlinear resonant modes. The dynamical system is composed of coupled nonlinear dynamical unit ‘processors’ at each vertex (in this case firing, excitable, refractory neurons) and incorporates transmission delays (resulting in delay-dynamical systems). Essentially, we should think of an SCG as a forced system when it is stimulated. If stimulated just once, and then left to its own devices, the nodes in the SCG will pass around excitation with suitable delays, possibly forever, and ultimately in an aperiodic or periodic fashion; or it may cease firing and return to equilibrium. In the former case, the connectivity of the network may allow the cyclic propagation of excitation around closed paths (cycles). Hence the attracting set should be expected to be diffeomorphic to a μ-dimensional torus, denoted by $\mathbb{T}^\mu$, for some suitable choice of dimension μ>1. This is simply the Cartesian product of μ copies of the circle $S^1$, parametrized by a μ-vector of 2π-periodic phase variables $\phi=(\phi_1,\ldots,\phi_\mu)^{\rm T}$. We expect such an attractor to support a winding dynamic, where all phases continually increase (the simplest case being $\dot\phi=\omega$, for some strictly positive vector of winding rates in $\mathbb{R}^\mu$). Indeed, we demonstrate directly that this is very often the case when estimating μ, or rather an upper bound m for μ, from our detailed model calculations.

    When such an SCG is forced with repeated, say T-periodic, stimulating pulses from some other upstream SCG then, for some such patterns of stimulation, it may produce a coherent, phase-locked mode of response. Such modes are firing patterns of behaviour distributed both in space (across the SCG) and in time. On the other hand, for some patterns of stimulation the SCG may only respond incoherently (with no phase locking). This generic occurrence (or not) of phase entrainment would certainly be possible if the autonomous SCG system resulted in a single stable nonlinear limit cycle; in such a case, suitable periodic stimulations would result in phase locking on Arnold tongues, as described in Glass & Mackey [25]. However, the delays within the SCG are likely to increase the capacity of an SCG's dynamics to exhibit alternative nonlinear cyclic modes, and to have an attractor embedded in a torus $\mathbb{T}^\mu$ with μ>1. Recent work [26] considered the generalization of type-one and type-zero phase transition mappings (valid for instantaneous phase resetting on the limit cycle) to more diverse phase-locking alternatives on higher-dimensional tori.

    Generalizing for a moment, there are many applications of directed graphs where the vertices represent separate, similar entities that are described in terms of r dynamical state variables, $x_j(t)\in\mathbb{R}^r$ say, at vertex $v_j$ ($j=1,\ldots,n$), with the graph's edges describing the directed coupling between them, and even where the dynamics feed back into the evolving edge couplings [27]. Typically, we will be in the situation where the state at vertex $v_j$ depends only on the states of the vertices $v_i$ that are immediately upstream of $v_j$ (those i for which $A_{ij}=1$). Such coupling may include time delays, $l_{ij}>0$, corresponding to the time taken to receive input from $v_i$ upstream. To be more specific, consider such a delay-differential equation having the form

    $$\dot{x}_j(t)=f(x_j(t))+\sum_{i\,|\,A_{ij}=1}h\big(x_j(t),\,x_i(t-l_{ij})\big),\qquad j=1,\ldots,n.$$

    Here, in the absence of coupling, $f:\mathbb{R}^r\to\mathbb{R}^r$ defines an autonomous nonlinear system at the jth vertex. The function $h:\mathbb{R}^r\times\mathbb{R}^r\to\mathbb{R}^r$ describes the forcing from each of the immediate upstream vertices, subject to the prescribed time lag.
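    As an illustration of how such a delay-differential system might be integrated directly, the sketch below uses FitzHugh–Nagumo dynamics in the role of f and a linear delayed voltage coupling in the role of h, with a fixed-step Euler scheme and a padded history buffer standing in for a proper DDE solver; all parameter values are illustrative, not taken from this paper.

```python
import numpy as np

def simulate_delay_coupled_fhn(adj, lags, T=500.0, dt=0.01, kick=1.0):
    """Forward-Euler integration of the delay-coupled system above, with
    FitzHugh-Nagumo vertex dynamics. adj[i] lists vertices immediately
    downstream of vertex i; lags[(i, j)] is the delay on edge (i, j)."""
    n = len(adj)
    steps = int(T / dt)
    max_lag = max(int(l / dt) for l in lags.values()) + 1
    v = np.zeros((max_lag + steps, n))     # voltage history, padded for the delays
    w = np.zeros(n)                        # slow recovery variables
    v[:max_lag, 0] = kick                  # kick-start vertex 0
    a, b, eps, k = 0.7, 0.8, 0.08, 0.5     # illustrative FHN and coupling constants
    for s in range(max_lag, max_lag + steps - 1):
        coupling = np.zeros(n)
        for i, outs in enumerate(adj):     # h: delayed upstream voltages, summed
            for j in outs:
                coupling[j] += k * v[s - int(lags[(i, j)] / dt), i]
        dv = v[s] - v[s] ** 3 / 3.0 - w + coupling   # f plus the coupling terms
        w = w + dt * eps * (v[s] + a - b * w)
        v[s + 1] = v[s] + dt * dv
    return v[max_lag:]
```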

    Suppose we have such a system defined on a very large directed network, where some subset of the vertices defines a maximal SCG. Consider the dynamics on that SCG (ignoring any downstream activity, which can have no influence on that of the SCG). Once ‘kick started’, and having no further input from any vertices upstream of the SCG, it will propagate activity (the separate firing of its various neurons) autonomously around itself. The necessary appearance of cycles (at least one) within such a strongly connected subnetwork means that quasi-periodic behaviours will dominate at large times, as the maximal SCG continually passes activity around, stimulating and re-stimulating itself, over and over.

    Returning to the specific application we have in mind, this is a useful model for a neuronal network within part of the human brain, where the vertex dynamics represented by f are given by the Hodgkin–Huxley equations, the FitzHugh–Nagumo equations, or any other proposed excitable-refractory dynamical system [28]. The transmission time lag then represents the time taken for the electrical membrane depolarization waves to travel from one neuron centre (the soma) out along its axon, across a synapse and then inwards via the dendrite of a connected neuron, stimulating it to spike (or not) at its own soma. Of course, if activity runs in a cycle and returns to a neuron that has fired previously, it may arrive during the neuron's refractory phase—during which it cannot be re-stimulated. If the cyclic propagation takes somewhat longer, the stimulus will return after the neuron has recovered, and the cycle can run again. When we perform numerical experiments solving the full delay differential equations above with Hodgkin–Huxley or FitzHugh–Nagumo dynamics, this is exactly what is observed. Certain cycles may not be viable while others are, and the whole is a nonlinearly coupled system of such cycles that, once kick started, eventually settles into a long-term quasi-periodic attractor.

    An extreme version of the above delay-dynamical system is to assert that f is such that:

    • (i) when perturbed sufficiently from its locally stable equilibrium state, it exhibits a very, very fast spike (that we may assume is instantaneous compared to the time scale of the transmission time-lags), followed by a fixed time-period, δ, during which the system cannot be re-stimulated (called a refractory period);

    • (ii) the coupling terms, represented by h, ensure that if any vertex immediately upstream should spike then, after the requisite time-lag, this is enough to excite the receiving vertex to spike immediately, providing that this stimulus does not arrive during a refractory period.

    If we treat the spikes as instantaneous pulses, then it is enough to keep track of the firing times (and the consequent refractory periods) for each vertex. This is a far less expensive calculation than resolving the original delay differential equations, with their wide multiplicity of time lags and their division between the fast (pulse firing) and slow (lagged neuron-to-neuron propagation) time scales.

    In this paper, we adopt this discrete approach to generate long-term behaviour (the inter-spike firing times for all vertices) for a very large variety of strongly coupled directed networks. All of the networks will be made of similar neuronal graphs, where we keep the expected vertex in and out degree, z, constant, so that locally all networks appear similar regardless of overall size. The result of this work is to show that as the size, n, of such networks is increased (while z is fixed), the degrees of freedom exhibited by the long-term dynamical behaviour increase only sublinearly. The corollary to such a claim is that, with limited volume and energy available, there is a diminishing marginal return on increasing the size of any strongly coupled subnetwork. Consequently, directed neuronal networks within human brains would have evolved so as to be wired with many small maximal SCGs, rather than exhibiting occasional large (or giant) ones. Large maximal SCGs simply do not provide the requisite increase in the variety of dynamic behaviour (degrees of freedom representing information and information processing within the macro-scale network) that would be worthy of the investment (of volume and energy, both of which are constrained).

    In making such simulations, it becomes necessary to sample and simulate dynamics on a very large number of strongly connected networks as their size, n, is increased over orders of magnitude while z is held constant. This is a challenging topic in itself, as random networks of a given size are rarely strongly connected. We use a method of ‘edge swapping’ to generate Markov chains within the set of strongly connected directed graphs having given degree distributions, so as to generate candidate SCGs for large n. This is set out in appendix A.

    Consider a strongly connected directed graph representing connected neurons incorporating an excitable-refractory-delay dynamic. We will assume that a single firing pulse (measurable in membrane potential) is an instantaneous single spike event. Following each such spike, the neuron sits in a refractory state and cannot be re-stimulated for a time interval of length δ>0.

    Once a neuron spikes, the pulse is propagated out to its immediate downstream connections, arriving after the requisite transmission time lags. The time lags are drawn independently and identically from a suitable distribution, and are assumed to be large compared with the pulse width; this justifies our treating each pulse as an instantaneous spike. Upon arrival at each downstream neuron, it is assumed that (i) if the downstream neuron is not sitting within a refractory period, then the incoming spike is strong enough to cause the downstream neuron to spike itself, and (ii) if the downstream neuron is sitting within a refractory period, then the incoming spike has no effect.

    So, given the network (the adjacency matrix) and the corresponding matrix of time lags for each directed edge, we may iteratively update the whole network's spike times with the following dynamical model.

    At any time $t=t_1$, for each neuron we will have a list of all of the historical spike times that have already occurred (in the past, for $t<t_1$), and also a list of possible spike times (in the future, for $t>t_1$) at which each neuron will receive an incoming spike (propagated from an upstream spike in the past). We select the earliest future incoming spike time over all of the neurons, at some time $t_2^{(a)}>t_1$ say, and then we check whether the corresponding receiving neuron will be in its refractory period at $t_2^{(a)}$ (due to its own most recent spike in its historical list). If it is, then that receiving neuron will not spike, and we do not record a firing at time $t_2^{(a)}$. Instead, we select the second earliest future incoming spike time over all of the neurons, at some time $t_2^{(b)}>t_1$. This process is repeated until the receiving neuron is not in a refractory period at some time $t_2^{(\cdot)}$. Then we assume that it spikes at $t_2^{(\cdot)}=t_2$, we move the clock on to $t=t_2$, we update the receiving neuron's list of historical spike times to include the spike at $t_2$, and we generate new possible future spike times for each of its immediately downstream neurons, at times given by $t_2$ plus the relevant transmission lag, adding these to the full list of possible future spike times. Then we iterate.

    To start up, we pick a single neuron and assume (i) that it spikes at t=0, so its list of future spike times contains that single element; (ii) that no neuron is refractory at t=0 (in effect, none has any historical firing events, and thus none has fired within the preceding interval δ); and (iii) that every other neuron's list of possible future firing times is empty.
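    The update rule just described maps naturally onto an event-driven simulation with a priority queue of pending incoming spikes; the sketch below is a minimal rendering of it under the stated assumptions (instantaneous spikes, fixed refractory period δ, pulses lost during refractoriness). The data structures and names are ours.

```python
import heapq

def simulate_scg(adj, lags, delta, t_end, kick_vertex=0):
    """Event-driven simulation of the excitable-refractory-delay dynamic.
    adj is a dict mapping each vertex to its immediately downstream vertices,
    lags[(i, j)] is the transmission delay on edge (i, j), and delta is the
    refractory period. Returns each vertex's list of spike times."""
    spikes = {v: [] for v in adj}          # historical spike times per vertex
    events = [(0.0, kick_vertex)]          # min-heap of future incoming spikes
    while events:
        t, v = heapq.heappop(events)       # earliest pending incoming spike
        if t > t_end:
            break
        if spikes[v] and t - spikes[v][-1] < delta:
            continue                       # arrived during a refractory period: lost
        spikes[v].append(t)                # v spikes now
        for u in adj[v]:                   # schedule lagged arrivals downstream
            heapq.heappush(events, (t + lags[(v, u)], u))
    return spikes
```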

    In figure 3, we show a simple example. Following a single kick-start spike at t=0 at vertex 1, we depict the resulting firing times as vertical bars, coloured by vertex: essentially 20 neural spike trains (figure 1) differentiated by colour, not location on the y-axis. The result is quasi-periodic.


    Figure 3. Here the number of vertices in the SCG is n=20, the average in/out degree is z=3, and the refractory period is δ=20. The diameter of the graph is 6. Transmission lags are independently and identically distributed (i.i.d.) uniformly in [50,100]. (a) The strongly connected graph. (b) The 20 neural spike trains (after a suitable burn-in period), with the spikes at each vertex coloured separately—not obviously periodic. (c) The natural log of the first 20 eigenvalues of the lag-correlation matrix (with windows of dimension k=80) for the inter-spike intervals (observed at a single vertex), indicating that the attractor is embeddable in 14 dimensions (the mean inter-spike interval is 22.08 at all vertices).


    In order to analyse such results, we consider the sequence of inter-spike intervals q (at a single vertex) so as to estimate a suitable embedding dimension for the state-space reconstruction of the long-term dynamical behaviour, close to the attractor. This is based on generalizations of the Takens embedding theorem [21] and analogous ideas [29], and is neatly summarized in Bradley & Kantz [30]: singular spectrum analysis enables such a reconstruction. The standard method is based on delay-coordinate embedding, where each of a series of ‘windows’ containing k successive past values of a scalar observable (from a dynamical system) forms a vector in $\mathbb{R}^k$. Such delay-coordinate embedding usually requires the data to be evenly sampled in time. However, when the data consist of discrete events, such as the spikes here, one can justify applying the window embedding to the sequence of inter-spike intervals itself [31,32]. Therefore, the corresponding lag-correlation matrix

    $$\begin{pmatrix}q_1&q_2&\cdots&q_k\\ q_2&q_3&\cdots&q_{k+1}\\ q_3&q_4&\cdots&q_{k+2}\\ \vdots&\vdots&&\vdots\end{pmatrix}$$

    is formally missing not only an embedding dimension m (because that is what we seek), but also a time delay parameter. Now, the k eigenvectors of the lag-correlation matrix define empirical orthogonal dimensions, while the corresponding eigenvalues (all positive and real by definition) account for the partial variance of the embedded point cloud in the corresponding direction. Signal-to-noise separation can be obtained by simply locating a significant break in the ordered list of eigenvalues (pink or white noise would produce a natural decay or plateau of the spectrum, without such large breaks). The position m (<k) after which this spectral break occurs is thus an estimated upper bound for the actual dimension μ of the underlying deterministic dynamics, and the corresponding first m eigenvectors define a set of coordinates with which to project and embed the sequence of windows. Thus, m represents an estimate of the dimension of the embedding space containing the underlying attractor.
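    A minimal sketch of this estimation procedure is given below: embed windows of length k, form the k×k lag-correlation matrix, and locate the break in the ordered eigenvalue spectrum. The break detector used here (the largest drop between successive log-eigenvalues) is one simple heuristic standing in for visual inspection of plots such as figure 3c; the function name is ours.

```python
import numpy as np

def embedding_dimension(q, k=80):
    """Estimate the upper bound m for the attractor dimension from a sequence
    of inter-spike intervals q, via singular spectrum analysis."""
    q = np.asarray(q, dtype=float)
    n_windows = len(q) - k + 1
    W = np.stack([q[i:i + k] for i in range(n_windows)])  # windows as rows
    C = (W.T @ W) / n_windows                             # k x k lag-correlation matrix
    eigvals = np.sort(np.linalg.eigvalsh(C))[::-1]        # real, non-negative, descending
    log_ev = np.log(np.maximum(eigvals, 1e-300))
    drops = log_ev[:-1] - log_ev[1:]          # gaps between successive spectral levels
    m = int(np.argmax(drops)) + 1             # position of the largest break
    return m, eigvals
```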

    Of course, if the attractor has actual dimension equal to μ then one may typically require a trial window-embedding dimension of at least k>m, which could be as large as 2μ. However, we might hope that m is closer to μ in our calculations, since $\mathbb{T}^\mu$ is certainly embeddable in $\mathbb{R}^{\mu+1}$.

    Consider the example given in figure 3. The series of firing times, the spike trains, for the SCG are shown coloured by vertex. We take the window length to be k=80. Considering the ordered list of the (logarithms of the) eigenvalues of the lag-correlation matrix, there is an obvious break after m=14. This leads us to assert that μ≤14, and μ is very likely to be 13, corresponding to an attractor in the form of $\mathbb{T}^{13}$.

    Hence, for any sampled SCG generated on n vertices, we will generate an appropriate set of transmission lags, drawn independently and identically from a given distribution, and calculate the firing times following a kick start. Then we will employ the above state-space embedding methodology to estimate an upper bound, m, for the actual dimension, μ, of the corresponding attractor. In the next section, we consider how m behaves as the number of vertices, n, varies, while keeping z, the expected in and out degree of all vertices, constant.

    In this section, we carry out a large survey, analysing SCGs with excitable-refractory-delay dynamics across a range of sizes, having similar edge densities (as measured by the mean in and out vertex degree, z), a fixed value of the refractory period, δ, and similar transmission lags. For each SCG dynamic, we estimate m, and thus we sample the conditional distribution P(m | n,X), where X stands for everything that we prescribed for the SCGs and the associated neuronal dynamics. We will examine how this distribution varies as the number of vertices, n, varies over orders of magnitude.

    More precisely, we assume that (i) the refractory period is δ=30, (ii) all transmission lags are independently and identically distributed (i.i.d.) uniformly on [50,100], and (iii) the expected in and out degree of all vertices, z, is approximately equal to 3.

    When n is large, it is not trivial to sample from the corresponding set of strongly connected directed graphs with mean in and out degree equal to 3. Consequently, we have adopted a method based on edge swapping that enables the generation of a Markov chain of such irreducible graphs, starting out at highly contrived graphs and evolving towards more homogeneous ones, with the graph diameter decreasing along the chain. This allows the generation of networks with relatively small diameters over many hundreds of vertices, and is set out in appendix A.

    Some typical results are shown in figure 4. It is clear that m increases sublinearly with n. This is an important insight because below we argue that m gives a measure of the different types of behaviour that a dynamic SCG can exhibit. If the dynamic on the SCG is forced at different frequencies, and at different vertices internally, then it may respond in different ways. If m=2, the attractor being a limit cycle, with μ=1, then the dynamics either entrain with the periodic forcing or do not. The situation for higher dimensional tori is somewhat more complicated [26]—nevertheless, there are likely to be different modes that can be resonant with distinct periodic forcing. Those resonant modes of response may also depend on the particular internal vertex at which the forcing from some upstream vertex (within some upstream SCG) is received.


    Figure 4. The median (middle line), interquartile range (box edges) and range (whisker edges) of the observed conditional distribution for m, P(m | n,X), on the vertical axis, as the number of vertices, n, varies along the x-axis. For each value of n (the number of vertices in the SCG), we sampled 100 SCGs and used refractory period δ=30 and average in/out degree z=3, with all transmission lags i.i.d. uniformly in [50,100], and k=80. The number of embedding dimensions for the SCGs scales sublinearly with n.


    We consider an SCG with an excitable-refractory-delay dynamic defined on it, as in previous sections. Rather than kick starting it once and then allowing it to run to a dynamic equilibrium, here we force it periodically with time period p>0. For each period, we estimate whether the resulting behaviour of the dynamic responds coherently, and becomes entrained, exhibiting periodic behaviour itself with period equal to an integer multiple of p. If it does not do so then the dynamic may become chaotic and thus remains relatively incoherent.

    Consider the system presented in figure 3. We force it periodically by stimulating a spike every p units of time arriving at vertex 1 (as if arriving from some upstream vertex, within an upstream SCG, that is firing with period p). If vertex 1 happens to be within a refractory period when it is stimulated then that forcing pulse has no effect. For different values of p, we can observe whether the system becomes periodically entrained, that is, when the system’s exhibited period to forcing period ratio is of the form K:1 (and so the system response has period Kp).
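    This forcing experiment needs only a small change to the event-driven sketch given earlier: external stimulation events at the forced vertex, one every p units of time, are merged into the same event queue, and any that land in a refractory interval are simply lost.

```python
import heapq

def simulate_forced(adj, lags, delta, period, t_end, forced_vertex=0):
    """Event-driven simulation as before, but with forced_vertex receiving an
    external stimulating pulse every `period` units of time."""
    spikes = {v: [] for v in adj}
    events = [(i * period, forced_vertex) for i in range(int(t_end / period) + 1)]
    heapq.heapify(events)                  # forcing pulses seed the event queue
    while events:
        t, v = heapq.heappop(events)
        if t > t_end:
            break
        if spikes[v] and t - spikes[v][-1] < delta:
            continue                       # pulse lost during refractoriness
        spikes[v].append(t)
        for u in adj[v]:
            heapq.heappush(events, (t + lags[(v, u)], u))
    return spikes
```

    Entrainment can then be read off by checking whether the long-run spike times at each vertex settle to a pattern whose period is an integer multiple of p.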

    In figure 5, we depict the situation for a wide range of values of p. In general, after 100 000 firings, the SCG dynamical system appears to respond ‘coherently’ and becomes entrained for certain intervals of the forcing period, p. The ‘gaps’ between these intervals imply either a higher order of entrainment or a chaotic system that does not become entrained.


    Figure 5. When the SCG is periodically stimulated with period p (the x-axis), the SCG dynamic generally settles to a period of length Kp. Note there are ‘gaps’ where no entrainment is observed, while between those gaps the entrained system’s period to forcing period ratio is 1 : 1 (with period p), implying ‘coherence’. When the ratio is not 1 : 1, a longer sequence of firings may be required, or the system is chaotic.


    In this paper, we considered a directed graph model for the human brain’s neural information processing architecture that is based on a collection of small, directed SCGs. We assumed that there are transmission delays in neuron-to-neuron stimulation and found that these are critical in increasing the capacity of the behaviour that each SCG can support. We demonstrated that, in isolation, the SCGs made of delay-coupled neuron dynamics typically have attractors that are equivalent to continual winding maps over relatively low-dimensional tori.

    We carried out a large-scale survey of such SCGs, containing n neurons, where n varies over orders of magnitude, while the expected vertex in and out degrees are fixed (so that all of the networks, when considered locally, have the same density of connections). For each such delay-differential system, the dimension μ of the corresponding attractor for the dynamical system represents a limit on the range of distinct modes of behaviour that can be exhibited. We found that the conditional distribution for the embedding dimension, m, for the attractor of dimension μ≤m, scales sublinearly with n. Thus, there may be benefits in brains having evolved so as to have a larger number of m-small irreducible sub-graphs, rather than fewer n-large irreducible sub-graphs or giant components.

    We conjecture that this result is a function of the architecture and is not particularly sensitive to (i) the particular choice of excitable-refractory dynamics employed, provided it involves relatively fast spikes and relatively slow neuron-to-neuron transmissions or (ii) the methodology employed to estimate m, an upper bound for the dimension μ of the long-run attractor. Indeed in future work we will demonstrate an identical result with continuous time delay differential equations and an examination of the Fourier spectra of the consequent long-term firing patterns.

    When the SCGs are coupled up at the macroscopic scale, where each is driven by a few other SCGs immediately upstream, we discussed the generalization of phase response curves and the existence of multiple distinct resonances. The point is that SCGs can only become phase locked and entrained with their drivers for certain resonant ranges of periodic stimulation. Thus, each SCG behaves as a kind of analogue filter, and thus an amplifier and propagator of a distinct and discrete number of alternative time-dependent behaviours. The nonlinearity of the dynamics means that resonant modes might be like competing hypotheses (in a Bayesian setting) with one winning out as more and more forcing stimulus is applied.

    There are no datasets supporting this article.

    P.G. conceived of the study. T.L. coordinated the numerical experiments. Both authors drafted the manuscript and gave final approval for publication.

    We have no competing interests.

    We are grateful for support from a number of EPSRC research grants: EP/G065802/1, The Digital Economy HORIZON Hub; EP/I017321/1, MOLTEN: Mathematics Of Large Technological Evolving Networks; EP/F033036/1, Cognitive Systems Science (Bridging the Gaps); EP/I016856/1, NeuroCloud and EP/H024883/1, Towards an integrated neural field computational model of the brain.

    We are pleased to acknowledge the advice and encouragement of our colleagues on the above grants, especially Doug Saddy, Des Higham and Clive Bowman.

Let Gn denote the set of all directed graphs over n vertices V = {vi | i = 1, …, n}, and let Hn ⊂ Gn be the subset of strongly connected directed graphs on V. A random directed graph is a probability distribution defined over Gn. The distributions for certain classes of random graph are easy to describe, and thus easy to sample from. For example, if each edge (vi, vj) is present independently with known probability pij, then sampling from such a distribution is straightforward. If the pij are the same, say p, for all possible edges (being zero when i = j), then we have the homogeneous random directed graph usually denoted by D(n,p).

In this paper, we wish to sample elements from Hn. Suppose we set pij = p(n), some constant probability, for i ≠ j (zero otherwise). Then we might generate a graph from D(n, p(n)) over Gn and test it for irreducibility. Further, we may take p(n) = z/(n−1), so that the expected in degree and out degree of each vertex remain equal to z as we vary n.

For relatively small n this ‘trial-and-error’ procedure is fine, yet it quickly becomes inefficient for large n (greater than 100, say). To see this, suppose we generate such a random directed graph on n vertices. The probability that a given vertex has zero in degree (or, equivalently, zero out degree), implying that the whole graph is not strongly connected, is $q = (1 - z/(n-1))^{n-1}$. As $n \to \infty$ we have $q \to e^{-z}$. Hence the probability that all vertices have both in and out degree at least one is approximately $(1 - e^{-z})^{2n}$. This last quantity is an upper bound on the probability that the graph is strongly connected, and shows why the ‘trial-and-error’ approach is doomed for large n.
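To make this concrete, the following sketch (in Python; the function names are ours, and the code is illustrative rather than the procedure used in the paper) samples from D(n, z/(n−1)) and tests strong connectivity via two depth-first searches. For n = 200 and z = 3 the bound $(1 - e^{-z})^{2n} \approx 10^{-9}$, so essentially no trial succeeds.

```python
import math
import random

def sample_digraph(n, p):
    """Sample from D(n, p): each ordered pair (i, j), i != j, is an edge
    independently with probability p."""
    return {(i, j) for i in range(n) for j in range(n)
            if i != j and random.random() < p}

def is_strongly_connected(n, edges):
    """Strong connectivity check: every vertex must be reachable from vertex 0
    along the edges, and also along the reversed edges (two DFS passes)."""
    def covers_all(adj):
        seen, stack = {0}, [0]
        while stack:
            for w in adj.get(stack.pop(), ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n
    fwd, bwd = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        bwd.setdefault(v, []).append(u)
    return covers_all(fwd) and covers_all(bwd)

n, z = 200, 3.0
p = z / (n - 1)  # keeps the expected in and out degree fixed at z
trials = 100
hits = sum(is_strongly_connected(n, sample_digraph(n, p)) for _ in range(trials))
print(f"strongly connected in {hits} of {trials} trials")
print("upper bound (1 - e^-z)^(2n):", (1 - math.exp(-z)) ** (2 * n))
```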

We follow the method discussed by Viger & Latapy [33,34], set within a wider frame by Blitzstein & Diaconis [35]. This in turn is based on the idea of edge switches (or swapping): taking two (or more) edges and swapping their destination vertices around to create replacement edges (see [36] and the discussion and references therein). This idea is long established within graph theory, as switching preserves both the in and out degrees of all vertices and hence is compatible with the configuration model. Such a strategy, originally proposed by Miklos & Podani [37], was extended by Tabouriera et al. [36] and Artzy-Randrup & Stone [38] to other types of constrained subsets of directed graphs.

    It is usually trivial to create directly an element of Hn with a given in and out degree distribution. But these will inevitably be highly contrived, since we would need to ensure the existence of cycles, or possibly a Hamiltonian cycle. So we will start off with some such artificially structured element and then successively transform it into something more homogeneous, and less contrived, having a smaller diameter, say.

    Suppose we have a graph G∈Hn. Then we may modify G randomly to create a new graph G′∈Hn according to the ‘switch and test’ algorithm below. We shall write G′=F(G) where G′ is drawn from a conditional probability distribution defined over Hn, say P(G′ | G). We shall select edge switches randomly and then test whether the new proposed graph maintains strong connectivity.

    • Step 1. Select two edges randomly, with equal probability, from the set of edges in G , say (va,vb) and (vα,vβ).

    • Step 2. Propose new edges (va,vβ) and (vα,vb) (swapping the end vertices over).

    • Step 3. If either (va,vβ) or (vα,vb) is already in G, or is a loop, go back to Step 1.

    • Step 4. Form the graph G′ from G by deleting the old edges (va,vb) and (vα,vβ) and inserting the new edges (va,vβ) and (vα,vb).

    • Step 5. Check whether G′ is strongly connected. If it is not then go to Step 1. If it is then EXIT.

The test in Step 5 could simply check the strict positivity of one of the matrix functions of the adjacency matrix A′ of G′, as discussed above. But in fact we can do better, since we only need to check that there are walks from va to vb and from vα to vβ in G′: any walk in G connecting a pair of vertices that previously relied on either or both of the deleted edges can be rerouted through these alternatives in G′, and hence G′ will be strongly connected. This can be achieved by two applications of the DFS algorithm starting from va and vα, respectively [23].
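A minimal sketch of Steps 1–5, assuming the graph is held as a set of directed edges together with an adjacency dict of out-neighbour sets (the representation and function names are our choice):

```python
import random

def reachable(adj, src, dst):
    """Depth-first search: is there a directed walk from src to dst?"""
    seen, stack = {src}, [src]
    while stack:
        for w in adj[stack.pop()]:
            if w == dst:
                return True
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def switch_and_test(edges, adj):
    """One application of F: propose a random two-edge switch and accept it
    only if strong connectivity survives. As in the text, it suffices to
    check for walks v_a -> v_b and v_alpha -> v_beta in the new graph."""
    while True:
        # Step 1: select two edges uniformly at random.
        (a, b), (alpha, beta) = random.sample(list(edges), 2)
        new1, new2 = (a, beta), (alpha, b)
        # Step 3: reject proposals that create loops or duplicate edges.
        if a == beta or alpha == b or new1 in edges or new2 in edges:
            continue
        # Step 4: tentatively perform the switch.
        edges -= {(a, b), (alpha, beta)}
        edges |= {new1, new2}
        adj[a].remove(b); adj[alpha].remove(beta)
        adj[a].add(beta); adj[alpha].add(b)
        # Step 5: accept if the two replacement walks exist; otherwise undo.
        if reachable(adj, a, b) and reachable(adj, alpha, beta):
            return
        edges -= {new1, new2}
        edges |= {(a, b), (alpha, beta)}
        adj[a].remove(beta); adj[alpha].remove(b)
        adj[a].add(b); adj[alpha].add(beta)
```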

Then, applying F successively, we create a Markov chain of elements of Hn. If G contains m edges then so does F(G); similarly, it is clear that the in and out degrees of all vertices are unchanged by F. Hence, if we require a strongly connected directed graph with any given distribution of in and out degrees, we first generate one by hand and then apply F as often as we wish.

For example, we have observed that it is easy, by trial and error, to create strongly connected directed graphs on only 50 vertices with z=3 (yet it is very hard to do so with 200 vertices). So we may proceed as follows. First we generate K>2 strongly connected graphs, each on only R (≪100) vertices, with a fixed expected in and out degree, z say. Then we form a single graph from their disjoint union, with n = KR vertices in total. We call the separate subgraphs ‘clusters’. Next we add K extra edges connecting the clusters in a directed cycle, as follows: we order the clusters into a cycle, and then create each new directed edge by selecting a random starting vertex from one cluster and a random ending vertex from the next cluster in the cluster cycle. The resulting graph G is trivially in Hn, as required.
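A sketch of this cluster-cycle construction, under the same representation as above (clusters given as pairs of vertex and edge sets; names are ours):

```python
import random

def cluster_cycle(clusters):
    """clusters: a list of (vertices, edges) pairs over pairwise disjoint
    vertex sets, each strongly connected. Returns the edge set of the
    disjoint union plus K extra edges wiring the clusters into a cycle."""
    edges = set()
    for _, cluster_edges in clusters:
        edges |= cluster_edges
    K = len(clusters)
    for k in range(K):
        u = random.choice(list(clusters[k][0]))            # random start vertex
        v = random.choice(list(clusters[(k + 1) % K][0]))  # random end vertex in the next cluster
        edges.add((u, v))
    return edges  # trivially strongly connected
```

Repeated calls to a switch-and-test step such as the one sketched above then disperse the block structure, as in figure 6.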

Next we may apply F many times over. An example of this process is shown in figure 6 for R=50, K=4, z=3 and thus n=200; we depict the corresponding adjacency matrix after every 20 successive edge swaps (applications of F). Each graph has 602 edges. After 480 swaps, we arrive at a graph (bottom right) in which the clusters’ ‘block’ structure has become finely dispersed.


Figure 6. A random walk through Hn with n=200 vertices, starting from a graph with K=4 separate irreducible clusters, each with R=50 vertices and z=3, joined together by a single directed edge cycle. We show the adjacency matrices for the graph after every 20 successive edge swaps (applications of F), demonstrating that the method preserves the average in/out degree of z=3 while homogenizing the spread of edges.


In figure 7, we give an example starting from a simple range-dependent random graph [39], where edge (vi, vj) (i ≠ j) is present with probability $p_{ij} = (1/3)^{|i-j|-1}$ (so that nearest neighbours are always directly connected, second neighbours are directly connected with probability 1/3, and so on). The expected in and out degree, z, is thus 3. The strongly connected graphs in the Markov chain all have 3018 edges.


Figure 7. A random walk through Hn with n=1000, starting from a range-dependent random graph [39] in which edge (vi, vj) (i ≠ j) is present with probability $p_{ij} = (1/3)^{|i-j|-1}$, so that both in and out degrees have an expected value of z=3 (there are 3018 edges in this realization). There are 1000 edge swaps (successive applications of F) between each successive graph. We show the adjacency matrices for successive graphs.


The Markov chain ranges over the subset of graphs in Hn having the exact joint distribution of in and out degrees specified for the starting graph. Suppose we focus on a single directed edge, E0, in the starting graph, out of m edges in total (m large). Then after km switches (k>1), the probability that E0 is unaffected is approximately $(1 - 2/m)^{km} \to e^{-2k}$. So if $k > (\ln m)/2$, i.e. the total number of switches exceeds $(m \ln m)/2$, then we expect fewer than one of the original edges to remain. For m=3000, we should take k=5 and perform 15 000 swaps. In figure 7, this corresponds to the 15th adjacency matrix shown.
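As a quick arithmetic check of this prescription (the helper name is hypothetical):

```python
import math

def swaps_to_forget(m):
    """Smallest integer k with k > (ln m)/2, at which the expected number of
    surviving original edges, m * exp(-2k), drops below one; returns k and
    the corresponding total number of switches, k * m."""
    k = math.floor(math.log(m) / 2) + 1
    return k, k * m

print(swaps_to_forget(3000))  # (5, 15000), matching the text
```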

In each of the above examples, the average degree (in and out) is approximately z=3. So the diameter, d, of each of the networks appearing within those Markov chains is bounded below by the diameter, d0 say, of a tree with no cycles in which each vertex has z out-edges. We have $d_0 > -1 + \log_z((z-1)n)$, which is around 6 for the parameters defining the chain of networks shown in figure 7. In fact, in those networks the diameter starts out very high, falls to 10 after around 2000 swaps (the third adjacency matrix shown), and remains there.

    Footnotes

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

• 1. Milton J. 1996 Dynamics of small neural populations, vol. 7. Providence, RI: American Mathematical Society.

• 2. Nicholls JG, Martin AR, Wallace BG, Fuchs PA. 2001 From neuron to brain, vol. 271. Sunderland, MA: Sinauer Associates.

• 3. McIntosh AR, Grady CL, Ungerleider LG, Haxby JV, Rapoport SI, Horwitz B. 1994 Network analysis of cortical visual pathways mapped with PET. J. Neurosci. 14, 655–666.

• 4. Grindrod P, Stoyanov ZV, Smith GM, Saddy DJ. 2014 Primary evolving networks and the comparative analysis of robust and fragile structures. J. Complex Netw. 2, 60–73. (doi:10.1093/comnet/cnt015)

• 6. Stam CJ, Nolte G, Daffertshofer A. 2007 Phase lag index: assessment of functional connectivity from multi channel EEG and MEG with diminished bias from common sources. Hum. Brain Mapp. 28, 1178–1193. (doi:10.1002/hbm.20346)

• 7. Lachaux JP, Rodriguez E, Martinerie J, Varela FJ. 1999 Measuring phase synchrony in brain signals. Hum. Brain Mapp. 8, 194–208. (doi:10.1002/(SICI)1097-0193(1999)8:4<194::AID-HBM4>3.0.CO;2-C)

• 8. Sakkalis V. 2011 Review of advanced techniques for the estimation of brain connectivity measured with EEG/MEG. Comput. Biol. Med. 41, 1110–1117. (doi:10.1016/j.compbiomed.2011.06.020)

• 9. Gao Z, Jin N. 2015 Multiscale complex network for analyzing experimental multivariate time series. Europhys. Lett. 109, 30005. (doi:10.1209/0295-5075/109/30005)

• 10. Gao ZK, Yang YX, Fang PC, Jin ND, Xia CY, Hu LD. 2015 Multi-frequency complex network from time series for uncovering oil-water flow structure. Sci. Rep. 5, 8222. (doi:10.1038/srep08222)

• 11. Gao ZK, Cai Q, Yang YX, Dang WD, Zhang SS. 2016 Multiscale limited penetrable horizontal visibility graph for analyzing nonlinear time series. Sci. Rep. 6, 35622. (doi:10.1038/srep35622)

• 12. Lacasa L, Luque B, Ballesteros F, Luque J, Nuno JC. 2008 From time series to complex networks: the visibility graph. Proc. Natl Acad. Sci. USA 105, 4972–4975. (doi:10.1073/pnas.0709247105)

• 13. Gao ZK, Fang PC, Ding MS, Jin ND. 2015 Multivariate weighted complex network analysis for characterizing nonlinear dynamic behavior in two-phase flow. Exp. Thermal Fluid Sci. 60, 157–164. (doi:10.1016/j.expthermflusci.2014.09.008)

• 14. Bullmore E, Sporns O. 2009 Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 186–198. (doi:10.1038/nrn2575)

• 15. Longtin A, Bulsara A, Moss F. 1991 Time-interval sequences in bistable systems and the noise-induced transmission of information by sensory neurons. Phys. Rev. Lett. 67, 656. (doi:10.1103/PhysRevLett.67.656)

• 16. De La Rocha J, Doiron B, Shea-Brown E, Josić K, Reyes A. 2007 Correlation between neural spike trains increases with firing rate. Nature 448, 802–806. (doi:10.1038/nature06028)

• 17. Lasota A, Mackey MC. 1994 Chaos, fractals, and noise: stochastic aspects of dynamics. Applied mathematical sciences, vol. 97. New York, NY: Springer Science & Business Media.

• 18. Mpitsos GJ, Burton RM. 1992 Convergence and divergence in neural networks: processing of chaos and biological analogy. Neural Netw. 5, 605–625. (doi:10.1016/S0893-6080(05)80039-5)

• 19. Sporns O, Tononi G, Edelman GM. 1991 Modeling perceptual grouping and figure-ground segregation by means of active reentrant connections. Proc. Natl Acad. Sci. USA 88, 129–133. (doi:10.1073/pnas.88.1.129)

• 21. Huke JP, Broomhead DS. 2007 Embedding theorems for non-uniformly sampled dynamical systems. Nonlinearity 20, 2205–2244. (doi:10.1088/0951-7715/20/9/011)

• 22. Grindrod P, Pinotsis D. 2010 On the spectra of certain integro-differential-delay problems with applications in neurodynamics. Phys. D: Nonlinear Phenomena 240, 13–20. (doi:10.1016/j.physd.2010.08.002)

• 23. Kleinberg J, Tardos VA. 2006 Algorithm design, pp. 92–94. Reading, MA: Addison Wesley.

• 24. Friston K. 2013 Life as we know it. J. R. Soc. Interface 10, 20130475. (doi:10.1098/rsif.2013.0475)

• 26. Grindrod P, Patel EL. 2015 Phase locking on the n-torus. IMA J. Appl. Math. 81, hxv031. (doi:10.1093/imamat/hxv031)

• 27. Ward JA, Grindrod P. 2014 Aperiodic dynamics in a deterministic adaptive network model of attitude formation in social groups. Physica D: Nonlinear Phenomena 282, 27–33. (doi:10.1016/j.physd.2014.05.006)

• 28. Grindrod P. 1991 Patterns and waves: the theory and applications of reaction-diffusion equations. Oxford, UK: Oxford University Press.

• 29. Castro R, Sauer T. 1997 Correlation dimension of attractors through interspike intervals. Phys. Rev. E 55, 287–290. (doi:10.1103/PhysRevE.55.287)

• 30. Bradley E, Kantz H. 2015 Nonlinear time-series analysis revisited. Chaos 25, 097610. (doi:10.1063/1.4917289)

• 31. Sauer T. 1995 Interspike interval embedding of chaotic signals. Chaos 5, 127–132. (doi:10.1063/1.166094)

• 32. Hegger R, Kantz H. 1997 Embedding of sequences of time intervals. Europhys. Lett. 38, 267–272. (doi:10.1209/epl/i1997-00236-0)

• 33. Viger F, Latapy M. 2005 Efficient and simple generation of random simple connected graphs with prescribed degree sequence. In Int. Computing and Combinatorics Conf., COCOON 2005, Kunming, China, 16–19 August 2005, pp. 440–449. Berlin, Heidelberg: Springer.

• 34. Viger F, Latapy M. 2005 Random generation of large connected simple graphs with prescribed degree distribution. Lecture Notes Comput. Sci. 3595, 440–449.

• 35. Blitzstein J, Diaconis P. 2010 A sequential importance sampling algorithm for generating random graphs with prescribed degrees. Internet Math. 6, 489–522. (doi:10.1080/15427951.2010.557277)

• 36. Tabouriera L, Roth C, Cointet JP. 2011 Generating constrained random graphs using multiple edge switches. J. Exp. Algorithmics 16, 1.7. (doi:10.1145/1963190.2063515)

• 37. Miklos I, Podani J. 2004 Randomization of presence-absence matrices: comments and new algorithms. Ecology 85, 86–92. (doi:10.1890/03-0101)

• 38. Artzy-Randrup Y, Stone L. 2005 Generating uniformly distributed random networks. Phys. Rev. E 72, 056708. (doi:10.1103/PhysRevE.72.056708)

• 39. Grindrod P. 2002 Range-dependent random graphs and their application to modelling large small-world Proteome datasets. Phys. Rev. E 66, 066702. (doi:10.1103/PhysRevE.66.066702)


    Page 10

Neutral theories of ecology and of evolution via genetic drift have been contentious topics among evolutionary biologists [1–3], owing to these theories’ assumption that selective forces need not act upon fitness differences between organisms [4] or genomes [5,6] in order to drive ecological community structure or the formation of species. Neutral theory, however, has continued to thrive in the academic literature [7,8], and it can be a useful tool for uncovering underlying mechanisms of evolutionary processes. In particular, its emergence in the agent-based modelling community has revealed the minimal mechanisms necessary for speciation to occur, while providing a null hypothesis against which models of selection can be tested. However, many aspects of neutral null models remain unexplored, which has hindered the development of null hypotheses that can classify ecological models based on the underlying dynamical rules of the system [9,10]. For example, in a 2005 review, DeAngelis & Mooij [11] categorized more than 900 agent-based models of ecological and evolutionary processes into seven major types (such as collective motion, foraging, population dynamics and speciation). In contrast with the widespread acceptance among population geneticists of genetic drift as a meaningful baseline model, many of the ecological models categorized by DeAngelis & Mooij did not investigate neutral evolution.

Understanding the evolutionary process of speciation involves identifying which mechanisms yield observed patterns of biological disparity (the number of distinct morphologies, phenotypes or body plans within a population) and diversity (the number of distinct species or groupings of biological taxa in a population). The latter has been the focus of much agent-based work [12–14]. Because many studies have not incorporated evolutionary mechanisms into their models of extinction risk [15], there is no clear understanding of how a dynamically evolving population can re-evolve following a mass extinction. With mounting evidence that Earth is in the midst of a sixth major extinction [16,17], during an epoch now formally proposed to be named the Anthropocene, the drastically increased extinction risk for many species highlights the urgency of investigating mechanisms of evolution following a mass extinction [18]. A possible first step is to understand the properties of survival-to-extinction phase transitions in a model of evolutionary dynamics.

While sympatric speciation is thought to be a less common mechanism than allopatric or parapatric speciation, it is nonetheless a significant driver of evolutionary processes, especially adaptive radiation [19]. Adaptive radiation in sympatry has been observed, for example, in East African cichlid fishes, the Hawaiian silverswords [19] and the Anolis lizards of the Caribbean [20]. It is also speculated to occur following an extinction event, when biota are cleared from previously occupied niches [21]. While experimental observation shows that sympatric speciation can be significant in natural processes, computational models can address the important mechanisms that drive this form of speciation. For instance, Kondrashov & Kondrashov [22] showed the possibility of sympatric speciation with the addition of sexual selection and a fitness component via different implementations of trait choice between the sexes, while Dieckmann & Doebeli [23] used a genetic assortative mating model to demonstrate that competition for similar resources ‘can initiate sympatric speciation even if mating depends on an ecologically neutral marker trait’.

    It has been hypothesized that if the underlying mechanisms of pattern and community formation can be understood, then, by observation of how populations fill their niches, the process (or mechanism) by which species evolved can be inferred [24]. Clustering patterns have been demonstrated in evolutionary, agent-based models with distinctly different dynamical rules [9,12,24–30]. For example, Young et al. [26] used a diffusion equation to govern population dynamics, while de Aguiar et al. [12] implemented an assortative mating scheme that allowed independent organisms to pick mates based on spatial and genetic proximity. Even though the above-cited models used different dynamical processes, they all demonstrated that an essential ingredient for the development of emergent clustering patterns was a spatial asymmetry between the birth and death process. Births must disperse locally, while death is a global affair. While these studies established that there are essential dynamical rules for emergent clustering, there is little consensus (as noted above) over what the appropriate null condition is in agent-based modelling for testing selection theories that relate to evolutionary processes such as speciation or adaptive radiation.

    The neutral model designed by de Aguiar et al. [12] used a hermaphroditic population that combined both genetic and spatial dynamics to compare simulated spatial patterns (clustering of individuals) to patterns observed in nature. The model predicts a constant speciation rate over long periods of time; after an initial transient period of rapid growth dominated by mutation and recombination, the number of species reached a steady state. Their observations were compared with the mammalian fossil record of the Meade Basin in Kansas, with the striking conclusion that speciation events in the basin occurred without geographical barriers. As their model is inherently sympatric, these results were cited as ‘disproving this once-dominant view, which was held because of the expectation that speciation is promoted by physical barriers’, and as evidence that the observed speciation events were not caused by glaciation, because the simulated clustering patterns, modelled without geographical barriers, mimicked patterns found in the mammalian fossil record. This model is classified as an equilibrium system, because populations were held constant throughout the entire simulation, and thus no form of extinction owing to outside sources was possible [12]. However, because life is inherently a non-equilibrium process (no population may ever come back from extinction), these models are of limited realism. Furthermore, as pointed out by Jablonski [31], until agent-based models are able to show hierarchical relations of evolutionary processes, thus relating a microevolutionary simulation to macroevolutionary processes revealed by the fossil record, comparison with actual biological processes will remain speculative at best.

The possibility of a non-equilibrium transition (e.g. the possibility of an ‘absorbing’ state such as extinction) was incorporated by Chave and co-workers in their spatially explicit model of sessile organisms, with the aim of understanding mechanisms that underlie community formation [24]. They compared the patterns that emerged when offspring were dispersed locally versus globally, and when the underlying dynamics between species was governed by neutral conditions versus models in which birth and death rates among the different species were asymmetric, with simple trade-offs (for example, species with a lower reproductive rate could compete better for space). They found that the governing mechanism of biodiversity was offspring dispersion, rather than the presence or absence of neutral conditions. This recalls the results cited above, which showed that local dispersion of births and global deaths distributed randomly throughout the population were the key requirements for emergent clustering, yet it also highlights the fact that an underlying selective process is not necessary for emergent clustering. This is further highlighted by a comparison of the work of Dees & Bahar [32], which demonstrated emergent clustering on a rugged fitness landscape, with that of Scott et al. [30] who, in a nearly identical model except for the introduction of neutral conditions, also demonstrated emergent clustering. Further, while the studies described above reveal key requirements for cluster formation, there have been few computational examinations of patterns of population disparity, with the exception of Raup & Gould [33], Slowinski & Guyer [34], and Pie & Weitz [9]. This leaves by the wayside an important aspect of biological pattern formation, one central to important problems of adaptive radiation, such as the re-investigation of the Burgess Shale fossils, where high population disparity and low species diversity prevail [21,35].

Previous work using an agent-based model of evolutionary dynamics similar to the work presented below, but on a rugged fitness landscape, exhibited a phase transition from survival to extinction as a single control parameter was varied [32]. The control parameter, mutability (µ), characterized how phenotypically different offspring could be from their parent organisms, which reproduced via an assortative mating scheme. It was then shown that this phase transition behaviour persisted even on a neutral fitness landscape, and not only for the assortative mating reproduction scheme but also for bacteria-like fission [30]. Continuous, absorbing phase transition behaviour was shown for both reproduction schemes and for two order parameters, the population size and the number of clusters (considered analogous to species). Consistent with the conclusion of Chave et al. [24] that selective forces do not determine the emergent properties of the system, random mating was also investigated in the neutral version of this model; in this case an inherently different, non-continuous type of population-based phase transition was observed, and no clustering behaviour occurred. A similar result showing a lack of ‘evolutionary branching’ with random mating was presented by Dieckmann & Doebeli [23]. Most recently, it was shown that this survival-to-extinction phase transition belongs to the directed percolation (DP) universality class, thus classifying and demonstrating universality in a neutral model of evolutionary dynamics [36].

In this paper, we investigate the DP phase transition behaviour as a function of a new control parameter, the individual death probability δ, and examine the specific organismal patterns that arise in the survival regime on the morphospace. The model is neutral in that no organism has a selective advantage over another, and all organisms are subject to the same dynamical rules. Using this null model, we investigate the relationship between morphological and lineage diversity. This relationship is of particular importance for palaeontologists, who must rely on morphological character alone to trace ancestry in the fossil record of deep time [9]. This long-standing problem of taxonomic classification was highlighted in the seminal paper by Raup & Gould, who used branching random walks to investigate lineages [33]. They showed that morphological ‘outliers’ could occur within a species of known lineage, and thus computationally identified a potential problem with characterizing species based on morphology alone.

    The agent-based model used here incorporates the fundamental characteristics of Darwinian evolution (heritability, variation and competition), but on a neutral fitness landscape. The model can be characterized by a branching and coalescing random walk, a mathematical process that directly maps onto a reaction–diffusion (RD) process in physics. The RD process A → 2A, A + A → A, and A → Ø can undergo a continuous phase transition belonging to the DP universality class [37–39]. This corresponds to the birth of new offspring (A → 2A), coalescence resulting from competition between two individuals leading to the death of one of them (A + A → A), and random death (A → Ø).

We specifically investigate emergent properties, namely the DP phase transition behaviour that occurs when individual agents reproduce, mutate and die on generational timescales that, for most species, could not typically be observed within a human lifetime. We show that two types of phase transition occur in this model: a non-equilibrium, absorbing transition belonging to the (2 + 1) dimensional DP universality class (where the terminology refers to two spatial dimensions and one temporal dimension), and an equilibrium transition that could allow for the investigation of morphological disparity patterns. Note that a non-equilibrium, absorbing transition means that the system has a state from which it can never escape (i.e. extinction), while in an equilibrium transition the population can move in either direction between the two states (i.e. between an aggregated state and a uniform population distribution) with equal probability.

    Simulations took place on a two-dimensional, continuous phenotype space with finite boundaries, with 45 arbitrary units along each axis. Therefore, the description of individuals is based on a continuous spectrum of phenotypic traits rather than on their location in geographical space. Each simulation started with an initial uniformly randomly distributed population of 300 individuals. While this limits the analogy between our model and sympatric speciation, these initial conditions were selected in order to minimize the transient time for populations to reach a steady state. The number of initial organisms has no effect on the phase transition behaviour of the system [40]. In fact, as the dynamics of our system behaves as a Markov process, with each state depending only on the prior state, the distribution of organisms at any given generation could be taken as an initial condition for future generations.

    In order to mimic neutral evolution, the fitness landscape was held constant, meaning that each organism produced the same number of offspring (denoted by the fitness f). The behaviour of the system was investigated as a function of fitness and the individual death probability δ. When δ served as the control parameter, the fitness was set to f = 2, and each simulation lasted for 2000 generations unless extinction occurred first. When the system's behaviour was investigated as a function of fitness, the individual death probability was held constant at δ = 1%, and each simulation was performed for 250 generations.

    Reproduction occurred via one of three schemes. Random mating is considered as a control condition; in this case, each organism randomly chose a mate and an ‘alternate’ mate from the general population. In assortative mating, each organism chose its nearest neighbour as its mate, and its second nearest neighbour was identified as an ‘alternate’ mate. The alternate mate is relevant to how clusters were determined (discussed below). Note that in both the random and assortative mating schemes, mating pairs need not be monogamous. The third scheme, asexual reproduction, is a branching process in which each organism splits into two offspring.

Offspring organisms were dispersed in each simulation depending on the fitness landscape and the mutability µ. For assortative and random mating, offspring were uniformly randomly dispersed in an area around the parents, limited by µ (illustrated in figure 1 for assortative mating). For asexual reproduction, new offspring were distributed randomly in a 2µ × 2µ area centred on the parent organism. The fitness, f, determined how many offspring were placed in the dispersal area of the parent organism(s). The system was investigated at µ = 0.30, 0.60 and 0.90; a sketch of these dispersal rules is given after figure 1.


    Figure 1. Schematic diagram for assortative mating. Steps (a–d) can be read from left to right. Parents are labelled as squares and offspring as circles. (a) One reference organism (yellow) selects its nearest neighbour (green) as a mate. Offspring are distributed in an area defined by the locations of the two parent organisms, extended by the mutability μ. (b) Yellow's offspring organisms are generated (red circles). (c) Green's offspring are generated (blue circles). Note that this example assumes that the yellow parent organism is also the nearest neighbour of the green organism; this will not always be the case, and thus mating pairs will not necessarily be ‘monogamous’. (d) After every parent has mated (each acting once as the reference organism), all parents are removed, leaving their offspring to act as parents for the next generation.

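The dispersal rules just described can be sketched as follows. This is our own minimal reading of them: for the mated schemes, we take the dispersal area to be the bounding box of the two parents extended by µ on every side, which is one plausible interpretation of figure 1; the function names are ours.

```python
import random

def asexual_offspring(parent, f, mu):
    """Asexual scheme: f offspring placed uniformly at random in a
    2*mu x 2*mu square centred on the parent (x, y)."""
    x, y = parent
    return [(random.uniform(x - mu, x + mu), random.uniform(y - mu, y + mu))
            for _ in range(f)]

def mated_offspring(p1, p2, f, mu):
    """Sexual schemes: f offspring placed uniformly at random in the
    bounding box of the two parents, extended by mu on every side
    (an assumed reading of the rule sketched in figure 1)."""
    xlo, xhi = min(p1[0], p2[0]) - mu, max(p1[0], p2[0]) + mu
    ylo, yhi = min(p1[1], p2[1]) - mu, max(p1[1], p2[1]) + mu
    return [(random.uniform(xlo, xhi), random.uniform(ylo, yhi))
            for _ in range(f)]
```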

Generations do not overlap, and thus parent organisms were removed after offspring production. The offspring then endured a series of death processes, implemented in the following order. First, based on a competition limit κ, whenever two organisms lay within a radial distance of κ = 0.25 units of one another, one of the pair was randomly removed. This corresponds to competition for resources between phenotypically similar organisms living in the same niche. The value of κ was held constant for all simulations shown here, but could in principle be used as another control parameter to drive the system dynamics. Other vagaries of fate, such as natural disaster or poor environmental conditions, were represented by a stochastic death process in which each organism was subject to an individual death probability δ, which served as a system control parameter and was varied on the interval [0.01, 0.50] in increments of 0.01. Each individual in every generation was randomly assigned a number between 0 and 1, and any organism with a value at or below δ was removed. Lastly, any organism found outside the boundary of the landscape was removed. Finite boundary conditions can be interpreted as necessary environmental selective pressures counteracting processes such as Fisherian runaway [41].
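A sketch of one pass of these death processes, in the order given (an O(N²) pairwise competition check; illustrative only, and the names are ours):

```python
import random

def apply_deaths(pop, kappa=0.25, delta=0.20, size=45.0):
    """pop: list of (x, y) phenotypes. Applies, in order: (i) competition,
    removing one of any two organisms within radial distance kappa;
    (ii) random death with individual probability delta; (iii) removal of
    organisms outside the [0, size] x [0, size] landscape."""
    alive = list(pop)
    removed = set()
    for i in range(len(alive)):               # (i) competition
        if i in removed:
            continue
        for j in range(i + 1, len(alive)):
            if j in removed:
                continue
            dx = alive[i][0] - alive[j][0]
            dy = alive[i][1] - alive[j][1]
            if dx * dx + dy * dy <= kappa * kappa:
                removed.add(random.choice((i, j)))
                if i in removed:
                    break
    alive = [a for k, a in enumerate(alive) if k not in removed]
    alive = [a for a in alive if random.random() > delta]   # (ii) random death
    return [a for a in alive                                # (iii) boundaries
            if 0.0 <= a[0] <= size and 0.0 <= a[1] <= size]
```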

Clusters were determined by mating pairs, which aligns this algorithm with the biological species concept: when a closed set is formed, a reproductively isolated population is obtained. The clustering algorithm is represented schematically in figure 2 for the assortative mating scheme. A group of three organisms (the reference organism, its mate, and its second nearest neighbour, the alternate mate) formed a cluster seed, and an iterative process then determined whether organisms within one cluster seed belonged to another; if so, that other cluster seed was incorporated into the growing cluster. The closed group of mating organisms generated by this process is representative of a species and can be represented as a bonded-cluster of mates (as seen in figure 3). The same algorithm is also used to identify clusters in the asexual reproduction case; however, in this case the nearest and second nearest neighbours represent the most and second-most phenotypically similar organisms, respectively, instead of actual mates. For the random mating scheme, both mates and alternate mates were chosen at random, and the clustering algorithm was applied to these mates as in the assortative case.
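One way to realize this clustering step is as the connected components of the bond graph linking each organism to its mate and alternate mate, which is equivalent to the iterative seed-merging just described; a union-find sketch (names ours, with mate and alt_mate given as index maps):

```python
def mate_clusters(indices, mate, alt_mate):
    """Connected components of the mate/alternate-mate bond graph,
    computed with union-find; equivalent to iteratively merging the
    three-organism cluster seeds described in the text."""
    parent = {i: i for i in indices}
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in indices:
        parent[find(i)] = find(mate[i])      # bond to mate
        parent[find(i)] = find(alt_mate[i])  # bond to alternate mate
    clusters = {}
    for i in indices:
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())
```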


    Figure 2. Schematic representation of the formation of reproductively isolated clusters. This algorithm is used for both assortative mating and asexual reproduction. The nearest organism to a reference organism is its ‘mate’ (solid lines). The second nearest organism to the reference organism is its ‘alternate’ mate (dashed lines). Lines are coloured to indicate the mate and alternate mate of the correspondingly coloured reference organism; for example, the white organism's mate is the blue organism, and its alternate is the yellow organism.



Figure 3. Schematic representation of clusters formed by mate choice and by percolating discs. Dots represent mating organisms. Mates (solid lines) and alternate mates (dashed lines) are included in the definition of a cluster. κ = 0.25 for all simulations and is the radius of the discs. Four clusters of bonded mates and seven disc clusters are shown. Note how the largest cluster of bonded mates spans the space from end to end, while the discs do not form a continuous overlapping chain. In such a case, bond percolation can occur in the absence of site percolation.


The Clark and Evans nearest neighbour index, R, characterizes ‘the manner and degree to which the distribution of individuals in a population on a given area departs from that of a random distribution’ [42]. The random distribution is defined by the random placement of N points on a space, for which the average nearest neighbour distance is $r_E = 1/(2\sqrt{\rho})$, where ρ is the population density [42]. The ratio $R = r_A/r_E$, where $r_A$ is the measured average nearest neighbour distance of the population being sampled, quantifies a population’s departure from a random distribution. Thus, when R < 1, the population is distributed in a more clumped, aggregated manner, and when R > 1, the population is more uniformly dispersed across the space. At R = 1, the spatial distribution of the population is said to be random. For the maximum packing of a space (with a population arranged in a hexagonal lattice structure), the measure approaches a limit of R = 2.1491 [42].
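The index is straightforward to compute from a population snapshot; a brute-force sketch (names ours):

```python
import math

def clark_evans_R(points, area):
    """Clark and Evans index: R = r_A / r_E, with r_E = 1/(2 sqrt(rho)) the
    expected nearest neighbour distance at density rho = N / area [42].
    R < 1: aggregated; R = 1: random; R > 1: uniform."""
    n = len(points)
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        d2 = min((xi - xj) ** 2 + (yi - yj) ** 2
                 for j, (xj, yj) in enumerate(points) if j != i)
        total += math.sqrt(d2)
    r_A = total / n
    r_E = 1.0 / (2.0 * math.sqrt(n / area))
    return r_A / r_E
```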

The organisms choose mates and cluster on a continuous space. The organisms in these clusters can be represented as discs of radius κ, and these discs may overlap up to the radial distance (see figure 3 for a schematic representation). For overlapping percolating discs, the percolation threshold is considered to occur at the point at which a continuous chain of overlapping discs spans the space from end to end. In models of this type, the fraction of the landscape filled, which is a function of a probabilistic filling factor [43], can act as the order parameter for the system. For the two-dimensional continuum model with overlapping discs, $\phi = 1 - e^{-\eta}$, where $\eta = \rho a$ is the filling factor, ρ is the population density and a is the disc area. Critical percolation values have been calculated to be $\eta_c \approx 1.12$ and $\phi_c \approx 0.67$ [44].
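The filling factor is equally direct to evaluate; in the sketch below, the population size of 2000 is an arbitrary illustrative value, not one taken from the simulations:

```python
import math

def filling_factor(n_organisms, area, kappa=0.25):
    """Continuum percolation of overlapping discs of radius kappa:
    eta = rho * a, with rho = N / area and a = pi * kappa**2, and the
    filled fraction is phi = 1 - exp(-eta). Critical values are
    eta_c ~ 1.12 and phi_c ~ 0.67 [44]."""
    rho = n_organisms / area
    eta = rho * math.pi * kappa ** 2
    return eta, 1.0 - math.exp(-eta)

eta, phi = filling_factor(n_organisms=2000, area=45.0 * 45.0)  # hypothetical N
print(eta, phi, eta > 1.12)  # is the percolation threshold crossed?
```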

    Representative clustering for assortative mating is shown in figure 4. The horizontal axes show three different mutabilities (µ = 0.30, 0.60 and 0.90), while the vertical axes show various values of the individual death probability, δ, sampled above and below the non-equilibrium phase transitions shown in figure 5. The approximate critical values δc (defined here as the first value of δ for which the population survives for 2000 generations) for each value of µ are also shown in figure 4. Note the qualitative difference in clustering for different values of δ and µ. Near criticality, clusters are clumped or aggregated, while above criticality, organisms are distributed more uniformly, and the largest cluster of mates can span the landscape from end to end. This indicates that there is a critical point for which a bonded-cluster of mates can span the space. Traditionally, the theory of ordinary percolation defines percolation on a regular lattice, where site percolation is defined with respect to the nodes of a given lattice configuration, and bond percolation is defined with respect to the bonds connecting the nodes [45]. We adopt similar terminology for the continuous space, considering the organisms as nodes and the interactions between them as bonds. Thus, mate choice creates bonded-clusters of mates. Likewise, when a parent generates offspring this creates a genealogical bond between the parent and offspring nodes.


    Figure 4. Clustering for assortative mating on a 45 × 45 landscape at 2000 generations. Individuals are represented by dots, with example clusters highlighted in red, white, yellow, purple and blue. The system is shown at critical values of δc = 0.23, 0.38, 0.43, for μ = 0.30, 0.60, 0.90, respectively. For δ = 0.20, the system exists within the survival regime for each value of μ shown.



Figure 5. Phase transition behaviour for population size and number of clusters for the assortative mating scheme (a,b) and asexual reproduction (c,d). Solid circles represent μ = 0.30, hollow circles μ = 0.60 and triangles μ = 0.90. Insets show a sharp rise in the standard deviation, averaged over five simulations, indicating the critical point of the phase transition. The y-axes of the insets run from (a) 0–100, (b) 0–60, (c) 0–60 and (d) 0–6; the horizontal axes of all insets run from 0 to 0.6.


Absorbing, continuous phase transitions for the order parameters of population size and number of clusters are shown in figure 5, for assortative mating (figure 5a,b) and for asexual reproduction (figure 5c,d). The phase transitions are classified as non-equilibrium because the system can enter an absorbing state of extinction; once a population reaches extinction, it can never come back. The continuous nature of the phase transitions is demonstrated by the sharp rise in the standard deviation of the order parameter at the critical point, δc (insets in figure 5). For µ = 0.30, 0.60 and 0.90, the critical point for assortative mating occurred at approximately δc ≈ 0.23, 0.38 and 0.43, respectively (see also figure 4). For asexual reproduction, critical points occurred at δc ≈ 0.26, 0.40 and 0.44.

    We find that, as µ increases, the system becomes more robust against extinction: for µ = 0.90, the population could survive even with a death probability of δ = 0.44 (figure 5a,c). However, with increasing system robustness, there is a significant loss in diversity because the number of clusters drops (as indicated by a decrease in the maximum number of clusters for µ = 0.60, 0.90, shown in figure 5b,d).

Non-critical (i.e. non-continuous) phase transition behaviour was observed as a function of δ for the population size (data not shown) in the random mating case. Random mating also resulted in clustering behaviour significantly different from that of the assortative and asexual reproduction cases, in that only one cluster was observed in each random mating simulation.

The clustering behaviour of the organisms can be quantified using the Clark & Evans nearest neighbour index, R, as shown in figure 6. This index itself appeared to undergo a continuous phase transition as a function of δ for the assortative mating and asexual reproduction cases (figure 6a,b). As with the population size and cluster number order parameters, a non-critical phase transition occurred in the random mating case (figure 6c). The critical points shown for the R transition occur at the same values of δc as in figure 5. Note, however, that the transition from a clumped, aggregated distribution (R < 1), through a random distribution (R = 1), to a uniform distribution (R > 1) does not occur at the critical point of these absorbing phase transitions, and that this distribution change transpires, at least for some parameter values, regardless of the mating scheme. The dashed lines in figure 6 indicate where this distribution change takes place. For the asexual and assortative schemes, the distribution transition occurs within the active (survival) phase of the extinction/survival phase transitions, while, for the random mating scheme, the transition occurs only for 2 ≤ μ ≤ 10: for μ < 2, R < 1, and for μ > 10, R > 1. Figure 7 shows a parameter space plot for the assortative mating scheme, with critical values of μc and δc represented by filled circles, and the values of µR and δR for which R = 1 represented by open circles. The R transition always lies within the active phase of the phase transition, as the (µR, δR) curve lies below the (µc, δc) curve, in the region of parameter space where the system is in its survival phase.


    Figure 6. R transition curves averaged over five simulations. Assortative mating (a) and asexual reproduction (b) show continuous phase transitions. The random mating case (c) shows no critical phase transition behaviour. Dashed horizontal lines indicate R = 1.



    Figure 7. Parameter space plot for assortative mating. Shaded area represents the extinction regime. Filled circles indicate the critical points of the continuous phase transition. Open circles indicate where R = 1, thus showing that the populations always change their distribution structure within the active phase of the phase transition.


We further investigated the clustering behaviour of the system as a function of the fitness f, to determine whether a space-filling continuum cluster of discs can percolate across the space. Figure 8 indicates that the critical value of the filling factor, ηc ≈ 1.12 (dashed line), is crossed between f = 2 and f = 3. Thus, at f = 2, the fitness at which all of the previous simulations were performed, the continuum percolation threshold has not yet been reached. No continuous cluster of discs existed, yet spanning bonded clusters of mates, defined by nearest-neighbour and next-nearest-neighbour bonds, were observed, as indicated by the example spanning clusters shown in figure 4.


    Figure 8. The filling factor for the continuum percolation analysis as a function of fitness. The horizontal dashed line represents the percolation threshold at ηc ≈ 1.12. Data were averaged over 20 simulations for each fitness value, and simulations were run for 250 generations.


The existence of non-equilibrium, continuous phase transitions, for both fluctuating and neutral conditions, as a function of mutability has previously been demonstrated in a family of agent-based evolutionary models [30,32,36]. This work has presented a version of the model that incorporates not only neutral fitness conditions but also an individual death probability that is uniform across all organisms at every generation, and we have shown the existence of a non-equilibrium, continuous phase transition (figure 5) as this parameter is varied.

In general, any model that is in agreement with the DP conjecture and undergoes an RD process of A → 2A, A + A → A and A → Ø should belong to the DP universality class [37–39]. The phase transition shown here follows the same RD process. Further, it is consistent with the first three properties of the DP conjecture outlined by Henkel et al. [38], namely that it: (i) displays a continuous transition from a fluctuating active state (survival) to a unique absorbing state (extinction), (ii) is characterized by a non-negative one-component order parameter, and (iii) follows short-ranged dynamical rules.

    The fourth part of the DP conjecture requires that a system be free of spatiotemporal disorder. The use of a null model brings us as close to this requirement as possible, given the stochastic nature of the simulation. First, the flatness of the fitness landscape—in other words, the absence of selection—minimizes spatial disorder. As for temporal disorder, previous studies of mutability-driven phase transitions [30,32,36] used a time-dependent death probability, δτ, defined such that individuals had up to a 70% chance of being killed in any given generation (i.e. the percentage of deaths varied from generation to generation). Here, instead of δτ, we assign an individual death probability, δ, so that in each generation every individual has the same probability of being killed. This modification allows for a more stringent matching to the definition of neutral fitness outlined by Hubbell [4], and hence provides a more suitable null condition because all individuals not only had the same number of offspring, but also shared the same probability of death. Subjecting a population to an individual death probability rather than a time-dependent death probability also reduces the temporal disorder in the system, bringing the model into closer agreement with the last requirement of the DP conjecture. Indeed, while it was recently shown by Vojta & Hoyos that strong temporal disorder gives rise to a novel universality class other than DP [46], evidence suggests that if the temporal disorder is comparatively weak, a system can still present critical exponents consistent with DP universality, as in the case of the present model on a neutral fitness landscape but with the time-varying death probability δτ [36]. These exponents quantify the scaling behaviour of the system in the neighbourhood of the phase transition, and values obtained by Scott et al. are consistent with the theoretically predicted values for DP [36]. Based on these considerations, we can conclude that the phase transition as a function of δ, shown in figure 5, is also consistent with the DP universality class.

    In the case of random mating, the dynamic rules of dispersion are no longer short-ranged and thus the third requirement of the DP conjecture is broken; no DP phase transition is observed in this case. This result is also consistent with the body of literature above showing that, in order for emergent clustering to appear, the dynamics must be governed by a local dispersion process and global death [9,12,24–30].

Clusters of the (2 + 1) dimensional DP universality class can be represented as time-dependent bonded genealogical clusters spanning the space-time axes from the first generation to the last generation of the simulation. However, there are also interactions, such as mate choice, between organisms within each generation that go beyond the simple RD process of birth and death. These within-generation interactions (as discussed above) create bonded-clusters of mates on the two-dimensional phenotype space, resulting in another type of phase transition, one that occurs in the two-dimensional phenotype space and is unrelated to the spread of the genealogical cluster through time. This phase transition relates to ordinary two-dimensional percolation, in which, once a critical percolation threshold is reached, a cluster of mates can span the space from end to end. Below this threshold, clusters are aggregated and unlikely to span the landscape. This is visually confirmed by the population snapshots in figure 4, where the five largest clusters are highlighted. For values of µ and δ just inside the survival regime, the highlighted clusters do not span the morphospace; for values of µ and δ well into the survival regime, the largest cluster spans the entire morphospace. This implies the existence of a critical value of δ at which it becomes possible for a cluster of mates to span the space. The two-dimensional bond percolation phase transition shown here is a continuous, equilibrium phase transition, and thus fundamentally different from the non-equilibrium transition from survival to extinction.

We have also shown that a second type of continuous, spatial percolation phase transition can occur only once the fitness is increased to f = 3 (figure 8). This is an example of a site percolation transition. For f = 2, no continuous, space-filling cluster of discs can span the space, even though bonded percolation clusters of mates do span the phenotype space. To distinguish clearly the two types of two-dimensional spatial percolation problem at issue, it is important to emphasize that the filling factor is a space-filling measure of continuum percolation, where organisms, defined as discs, can never be closer than κ = 0.25 units to each other. Figure 3 illustrates how a connected cluster of mates can span the morphospace at f = 2, while the clusters of discs do not. Increasing f to 3 and beyond changes the branching step of the RD process to A → (m + 1)A, where m ≥ 2, thus changing the process which drives the system. Changing the RD process from f = 2 to f = 3 may not remove the system from the DP universality class (T. Vojta 2015, personal communication), yet it does change the structure of the DP clusters, because organisms can only sparsely fill the morphospace at f = 2, while they can pack more closely together for f > 2.

The Clark and Evans index R characterizes the dispersal patterns of populations based on a nearest neighbour measure [42]. When R < 1, the population is distributed in an aggregated manner; when R = 1, it is distributed as if the points were placed randomly on the space; when R > 1, the organisms are said to be uniformly distributed within the space. In this sense, the measure strictly identifies the pattern of the already dispersed organisms, and does not specifically characterize dispersion owing to the underlying RD process. In other words, it describes the average structure of the population at a moment in time, rather than the DP process of the population across multiple generations. Furthermore, R does not distinguish whether a clumped, aggregated population distribution is composed of many small reproductively isolated clusters or of one giant cluster; for this reason, it is not a good measure of population disparity. Although the emergent properties of clustering are determined by the type of reproduction the system undergoes, the population distribution as measured by R yields a transition from R < 1 to R > 1 regardless of reproduction scheme. It should be noted, however, that there are some significant differences in the behaviour of R between reproduction schemes (figure 6): the R transition occurs within the active phase of the continuous phase transition (figures 6a,b and 7), and, in the random mating case, only for a select range of μ values (figure 6c).

By quantifying how populations fill the morphospace, the R measure directly shows that the process by which a species evolves cannot be determined from community pattern alone, contrary to the hypothesis of Chave [7]. In other words, the clumping or grouping of morphological characters by nearest-neighbour distance provides little, if any, information about how a population evolved. This is highlighted by the fact that the assortative mating and asexual reproduction schemes exhibited many clusters, while the random mating scheme had only one giant cluster of the order of the population size, yet all reproduction schemes exhibited an R transition, at least for some parameter values (figure 6). Further, the spatially asymmetric birth and death placements, generated via the RD process, allowed for emergent clustering, but do not completely determine organismal dispersal patterns on the landscape. Fig. 3.9 of Henkel et al. illustrates how the appearance of DP clusters changes along the phase transition line of the (1 + 1) dimensional Domany-Kinzel model, depending on the interplay of the parameters involved [38]. Similarly, the visual appearance of the DP clusters in our model is likely to change as a function of δ as populations transition from an aggregated, to a random, to a uniform distribution within the survival regime. Interestingly, since these changes in distribution occur in the active phase (figure 7), the different kinds of DP percolation clusters are not restricted to the phase boundary.

Clusters, which exhibit reproductive isolation within a given generation, can be interpreted as roughly analogous to species. However, there are important limitations to this analogy. In a subsequent generation, the descendants of clusters can merge once again if their offspring are sufficiently close to one another. From this perspective, clusters are more similar to incipient species, where sub-populations might break up the sexual continuum in particular ways or fill niche gaps, but can still interbreed [47].

    It has been shown by King [40] that independent lineage structures do evolve in this model and that these lineage structures depend heavily on where they are located in parameter space. Near the critical point, lineages exhibited punctuated bursts of evolutionary change, and macroscopic analysis of branching behaviour (distributions of times to the most recent common ancestor) exhibited power-law scaling. This punctuated behaviour was observed for both individual and cluster lineages, indicating that the two levels of organization scale similarly in time.

    The model studied here involves evolution on a morphospace only. The addition of a genetic component could be used to impose a more definitive speciation criterion, rendering separate populations unable to interbreed once they become too genetically dissimilar, as in the models developed by de Aguiar et al. [12,48]. It has been shown in such models that the minimal genome size needed for such a speciation event decreases significantly when spatial restrictions are also imposed on the organisms [48]. By contrast, without a spatial constraint, an infinite genome size is needed [49].

    The properties of the DP phase transitions, the ordinary bond percolation properties of clusters of mating groups and the corresponding continuum percolation problem, as well as the R transition in the overall dispersion of the organisms regardless of mating scheme, all demonstrate the inability of the organisms to completely saturate the morphospace. As noted above, for DP RD dynamics in this model, dispersion was restricted to f = 2, and f needs to be greater than 2 for a space-filling continuum disc cluster (site percolation) to occur. However, it is possible for a cluster of bonded mates to span the space. This is consistent with the observation that the maximum R value obtained in the simulations for f = 2 was 1.42, while a value of R = 2.1491 corresponds to a maximally filled space [42]. A space-spanning cluster of bonded mates could contain many disc clusters, as shown in figure 3. In other words, the phenotypic characters within a reproductively isolated cluster could cover a widely distributed, and even discontinuous, range.
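    For readers who wish to experiment, the sketch below (an illustrative construction of ours, with a simple O(n²) neighbour search) extracts continuum disc clusters of the kind shown in figure 3 by union-find: two organisms are joined when their separation is at most a threshold distance delta, directly or through intermediaries.

```python
import numpy as np

def disc_clusters(points, delta):
    """Continuum (disc) clusters: two 2D points share a cluster if they lie
    within distance delta of each other, directly or through a chain of
    intermediate points. Plain union-find with path compression."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.hypot(*(pts[i] - pts[j])) <= delta:
                parent[find(i)] = find(j)   # merge the two clusters

    return [find(i) for i in range(n)]      # equal labels = same cluster
```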

    The sparse filling of phenotype space is consistent with the potential pitfalls (morphological ‘outliers’ occurring within known lineages) discussed by Raup & Gould [33] and Pie & Weitz [9] for classifying lineages based on morphological characters alone. Further, as evidenced by the soft-bodied creatures of the Burgess Shale, it is possible to have high disparity and low diversity [21,35], indicating that many body plans may exist within groups. This is a significant problem for palaeontologists because much of the taxonomy derived from the fossil record of deep time results from classification via morphological characters. In the context of mass extinction, clades that had high disparity early in their evolution (bottom-heavy clades) were three times more likely to survive a mass extinction, whereas top-heavy clades were more likely to succumb to one of the ‘big five’ mass extinction events [50]. Yet the potential of populations to recover from mass extinction may depend at least as much on lineage structure as on morphology, making the relationship between lineage and morphology critical to understand in this age of the ‘sixth extinction’ [17].

    Universality class identification of the non-equilibrium DP transition in this model provides important information about the system dynamics as it moves from survival to extinction. There are non-equilibrium transitions other than DP [38], which exhibit different dynamical signatures (scaling exponents). The particular dynamics of a survival-to-extinction transition may provide tools to investigate biological systems that are nearing collapse, or to predict the conditions under which recovery from mass extinction can occur. Critical dynamics has been used to identify early warning signs of population collapse in various systems [51,52].

    DP is a canonical, non-equilibrium transition that can occur in a wide range of systems; systems in other universality classes will have a distinctly different behaviour. In addition to the DP transition, the model investigated here exhibits spatial clustering and an equilibrium bond percolation transition. It is possible that the local clustering dynamics might change significantly if the non-equilibrium transition did not follow the dynamical rules of DP, as is the case when the dynamical rules of offspring dispersal are no longer short ranged (random mating). However, phase transition behaviour in the statistical physics sense has yet to be studied in clustering models like that of Young et al. [26]. It is known that short-range interactions are necessary for clustering [9,12,24–30] and also for a DP transition [38]. Beyond this, however, the relationship between specific characteristics of clustering behaviour and the non-equilibrium dynamics of particular phase transitions remains to be explored.

    The potential of the present model lies in its ability to track lineages as well as morphological patterns, and to quantitatively characterize the relationship between the two. In the future, continuum percolation may be used to develop a within-cluster diversity measure that could be used to address the relationship between lineage and morphological diversity (e.g. a diversity versus disparity measure), as well as to computationally address the limiting factors in morphospace saturation that contribute to high disparity in early evolution [53] or ontogeny [54]. A number of steps must be taken in order for models of this type to become practically useful to the palaeontological community. These steps will include development of analytical techniques to investigate existing branching patterns, comparison of null models with ones that incorporate selection, simulations of adaptive radiation of both lineages and morphologies during recovery from mass extinction, and the identification of hierarchical relations above the cluster level.

    All code used in the simulations and analysis can be found online in a GitHub repository: https://github.com/DawnMKing.

    D.M.K. and S.B. designed the study; D.M.K. and A.D.S. programmed the simulations and analysis code; D.M.K. carried out the simulations; D.M.K. performed the data analysis. D.M.K. drafted the manuscript; all authors participated in revising and finalizing the manuscript, and gave final approval for publication.

    The authors have no competing interests to declare.

    This work was supported by a Missouri Research Board Grant and by a ‘Studying Complex Systems’ grant from the James S. McDonnell Foundation.

    The authors would like to acknowledge Dr Nathan Dees for his role in developing the initial version of this model.

    Footnotes

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Ricklefs RE. 2006 The unified neutral theory of biodiversity: do the numbers add up? Ecology 87, 1424–1431. (doi:10.1890/0012-9658(2006)87[1424:TUNTOB]2.0.CO;2)
    2. Clark JS. 2012 The coherence problem with the Unified Neutral Theory of Biodiversity. Trends Ecol. Evol. 27, 198–202. (doi:10.1016/j.tree.2012.02.001)
    3. Ricklefs RE, Renner S. 2012 Global correlations in tropical tree species richness and abundance reject neutrality. Science 335, 464–467. (doi:10.1126/science.1215182)
    4. Hubbell SP. 2001 The unified neutral theory of biodiversity and biogeography. Princeton, NJ: Princeton University Press.
    5. Kimura M. 1968 Evolutionary rate at the molecular level. Nature 217, 624–626. (doi:10.1038/217624a0)
    6. Kimura M. 1983 Neutral theory of molecular evolution. Cambridge, UK: Cambridge University Press.
    7. Chave J. 2004 Neutral theory and community ecology. Ecol. Lett. 7, 241–253. (doi:10.1111/j.1461-0248.2003.00566.x)
    8. May F, Huth A, Wiegand T. 2015 Moving beyond abundance distributions: neutral theory and spatial patterns in a tropical forest. Proc. R. Soc. B 282, 20141657. (doi:10.1098/rspb.2014.1657)
    9. Pie MR, Weitz JS. 2005 A null model of morphospace occupation. Am. Nat. 166, E1–E13. (doi:10.1086/430727)
    10. D'Andrea R, Ostling A. 2016 Can clustering in genotype space reveal ‘niches’? Am. Nat. 187, 130–135. (doi:10.1086/684116)
    11. DeAngelis DL, Mooij WM. 2005 Individual-based modeling of ecological and evolutionary processes. Annu. Rev. Ecol. Evol. Syst. 36, 147–168. (doi:10.1146/annurev.ecolsys.36.102003.152644)
    12. de Aguiar MAM, Baranger M, Baptestini EM, Kaufman L, Bar-Yam Y. 2009 Global patterns of speciation and diversity. Nature 460, 384–387. (doi:10.1038/nature08168)
    13. Baptestini EM, de Aguiar MAM. 2013 The role of sex separation in neutral speciation. Theor. Ecol. 6, 213–233. (doi:10.1007/s12080-012-0172-2)
    14. Martins AB, de Aguiar MAM, Bar-Yam Y. 2013 Evolution and stability of ring species. Proc. Natl Acad. Sci. USA 110, 5080–5084. (doi:10.1073/pnas.1217034110)
    15. Urban MC. 2015 Accelerating extinction risk from climate change. Science 348, 571–573. (doi:10.1126/science.aaa4984)
    16. Barnosky AD et al. 2011 Has the Earth's sixth mass extinction already arrived? Nature 471, 51–57. (doi:10.1038/nature09678)
    17. Kolbert E. 2014 The sixth extinction: an unnatural history. New York, NY: Henry Holt and Company.
    18. Lewis SL, Maslin MA. 2015 Defining the Anthropocene. Nature 519, 171–180. (doi:10.1038/nature14258)
    19. Schluter D. 2000 The ecology of adaptive radiation. New York, NY: Oxford University Press.
    20. Knox AK, Losos JB, Schneider CJ. 2001 Adaptive radiation versus intraspecific differentiation: morphological variation in Caribbean Anolis lizards. J. Evol. Biol. 14, 904–909. (doi:10.1046/j.1420-9101.2001.00358.x)
    21. Gould SJ. 1989 Wonderful life: the Burgess Shale and the nature of history. New York, NY: W.W. Norton and Company.
    22. Kondrashov AS, Kondrashov FA. 1999 Interactions among quantitative traits in the course of sympatric speciation. Nature 400, 351–354. (doi:10.1038/22514)
    23. Dieckmann U, Doebeli M. 1999 On the origin of species by sympatric speciation. Nature 400, 354–357. (doi:10.1038/22521)
    24. Chave J, Muller-Landau HC, Levin SA. 2002 Comparing classical community models: theoretical consequences for patterns of diversity. Am. Nat. 159, 1–23. (doi:10.1086/324112)
    25. Meyer M, Havlin S, Bunde A. 1996 Clustering of independently diffusing individuals by birth and death processes. Phys. Rev. E 54, 5567–5570. (doi:10.1103/PhysRevE.54.5567)
    26. Young WR, Roberts AJ, Stuhne G. 2001 Reproductive pair correlations and the clustering of organisms. Nature 412, 328–331. (doi:10.1038/35085561)
    27. Houchmandzadeh B. 2002 Clustering of diffusing organisms. Phys. Rev. E 66, 052902. (doi:10.1103/PhysRevE.66.052902)
    28. Fuentes MA, Kuperman MN, Kenkre VM. 2003 Nonlocal interaction effects on pattern formation in population dynamics. Phys. Rev. Lett. 91, 158104. (doi:10.1103/PhysRevLett.91.158104)
    29. Houchmandzadeh B, Vallade M. 2003 Clustering in neutral ecology. Phys. Rev. E 68, 061912. (doi:10.1103/PhysRevE.68.061912)
    30. Scott AD, King DM, Marić N, Bahar S. 2013 Clustering and phase transitions on a neutral landscape. Europhys. Lett. 102, 68003. (doi:10.1209/0295-5075/102/68003)
    31. Jablonski D. 2007 Scale and hierarchy in macroevolution. Palaeontology 50, 87–109. (doi:10.1111/j.1475-4983.2006.00615.x)
    32. Dees ND, Bahar S. 2010 Mutation size optimizes speciation in an evolutionary model. PLoS ONE 5, e11952. (doi:10.1371/journal.pone.0011952)
    33. Raup DM, Gould SJ. 1974 Stochastic simulation and evolution of morphology: towards a nomothetic paleontology. Syst. Biol. 23, 305–322. (doi:10.1093/sysbio/23.3.305)
    34. Slowinski JB, Guyer C. 1989 Testing the stochasticity of patterns of organismal diversity: an improved null model. Am. Nat. 134, 907–921. (doi:10.1086/285021)
    35. Gould SJ. 1991 The disparity of the Burgess Shale arthropod fauna and the limits of cladistic analysis: why we must strive to quantify morphospace. Paleobiology 17, 411–432. (doi:10.1017/S0094837300010745)
    36. Scott AD. 2014 Speciation dynamics of an agent-based evolution model in phenotype space. PhD dissertation, University of Missouri, St Louis, MO, USA. See http://scholarsmine.mst.edu/doctoral_dissertations/2270/.
    37. Hinrichsen H. 2000 Non-equilibrium critical phenomena and phase transitions into absorbing states. Adv. Phys. 49, 815–958. (doi:10.1080/00018730050198152)
    38. Henkel M, Hinrichsen H, Lübeck S. 2008 Non-equilibrium phase transitions: volume 1: absorbing phase transitions. Amsterdam, The Netherlands: Springer Science and Business Media B.V.
    39. Ódor G. 2008 Universality in nonequilibrium lattice systems: theoretical foundations. NJ: World Scientific.
    40. King DM. 2015 Evolutionary dynamics of speciation and extinction. PhD dissertation, University of Missouri, St Louis, MO, USA. See http://scholarsmine.mst.edu/doctoral_dissertations/2464.
    42. Clark PJ, Evans FC. 1954 Distance to nearest neighbor as a measure of spatial relationships in populations. Ecology 35, 445–453. (doi:10.2307/1931034)
    44. Mertens S, Moore C. 2012 Continuum percolation thresholds in two dimensions. Phys. Rev. E 86, 061109. (doi:10.1103/PhysRevE.86.061109)
    45. Stauffer D, Aharony A. 1994 Introduction to percolation theory, revised 2nd edn. Philadelphia, PA: Taylor & Francis.
    46. Vojta T, Hoyos JA. 2015 Infinite-noise criticality: nonequilibrium phase transitions in fluctuating environments. Europhys. Lett. 112, 30002. (doi:10.1209/0295-5075/112/30002)
    47. Noest AJ. 1997 Instability of the sexual continuum. Proc. R. Soc. Lond. B 264, 1389–1391. (doi:10.1098/rspb.1997.0193)
    48. de Aguiar MAM. 2017 Speciation in the Derrida–Higgs model with finite genomes and spatial populations. J. Phys. A: Math. Theor. 50, 085602. (doi:10.1088/1751-8121/aa5701)
    49. Higgs PG, Derrida B. 1991 Stochastic models for species formation in evolving populations. J. Phys. A: Math. Gen. 24, L985–L991. (doi:10.1088/0305-4470/24/17/005)
    50. Hughes M, Gerber S, Wills MA. 2013 Clades reach highest morphological disparity early in their evolution. Proc. Natl Acad. Sci. USA 110, 13875–13879. (doi:10.1073/pnas.1302642110)
    51. Scheffer M et al. 2012 Anticipating critical transitions. Science 338, 344–348. (doi:10.1126/science.1225244)
    52. Lever JJ, van Nes EH, Scheffer M, Bascompte J. 2014 The sudden collapse of pollinator communities. Ecol. Lett. 17, 350–359. (doi:10.1111/ele.12236)
    53. Oyston JW, Hughes M, Wagner PJ, Gerber S, Wills MA. 2015 What limits the morphological disparity of clades? Interface Focus 5, 20150042. (doi:10.1098/rsfs.2015.0042)
    54. Minelli A. 2016 Species diversity vs. morphological disparity in the light of evolutionary developmental biology. Ann. Bot. 117, 781–794. (doi:10.1093/aob/mcv134)


    Page 11

    Dynamical processes taking place in complex networks are ubiquitous in natural and technological systems [1]; examples include disease or epidemic spreading in human society [2,3], virus invasion in computer and mobile phone networks [4,5], behaviour propagation in online social networks [6] and air or water pollution diffusion [7,8]. Once an epidemic or environmental pollution emerges, it is often of great interest to identify its source within the network accurately and quickly so that proper control strategies can be devised to contain or even eliminate the spreading process. In general, various types of spreading dynamics can be regarded as diffusion processes in complex networks, and it is of fundamental interest to be able to locate the sources of diffusion. A straightforward, brute-force search for the sources requires access to global information about the dynamical states of the network. However, for large networks, a practical challenge is that our ability to obtain and process global information can be quite limited, making brute-force search impractical, with undesired or even disastrous consequences. For example, the standard breadth-first search algorithm for finding shortest paths, when implemented in online social networks, can induce an information explosion even for a small number of search steps [9]. Recently, in the effort to locate the source of the Ebola virus outbreak in Africa, five medical practitioners lost their lives [10]. All of this calls for the development of efficient methodologies to locate diffusion sources based only on limited, practically available information, without the need to acquire global information about the dynamical states of the entire network.

    There have been pioneering efforts to address the source localization problem in complex networks, such as those based on maximum-likelihood estimation [11], belief propagation [12], the hidden geometry of contagion [13] and inverse spreading [14,15]. In addition, approaches have been developed for identifying super-spreaders that promote spreading processes stemming from sources [16–18]. In spite of these efforts, achieving accurate source localization from a small number of measurements remains challenging. Prior to our work, a systematic framework dealing with the localization of diffusion sources for arbitrary network structures and interaction strengths was missing.

    In this paper, we develop a theoretical framework that addresses the problem of network source localization in a detailed and comprehensive way. The main focus is on the fundamental issue of locatability: given a complex network and limited (sparse) observation, are diffusion sources locatable? A practical and extremely challenging issue is whether, for a given network, a minimum set of nodes can be identified that produces sufficient observation to locate sources at arbitrary locations in the network. To address these issues systematically, we use a two-step strategy. First, we develop a minimum output analysis to identify the minimum number of messenger/sensor nodes, denoted N_m, needed to fully locate any number of sources in an efficient way. The ratio of N_m to the network size N, n_m ≡ N_m/N, thus characterizes the source locatability of the network, in the sense that networks requiring smaller values of n_m are deemed to have stronger source locatability. Our success in offering the minimum output analysis stems from taking advantage of the dual relation between the recently developed controllability theory [19] and the canonical observability theory [20]. Second, given the N_m messenger nodes, we formulate the source localization problem as a sparse signal reconstruction problem, which can be solved using compressive sensing (CS) [21,22], a convex optimization paradigm. The basic properties of CS allow us to accurately locate sources from a small amount of measurement from the messenger nodes, much less than that required by the conventional observability theory. We use our framework to examine a variety of model and real-world networks, offer analytical predictions of n_m, and demonstrate good agreement with numerical calculations. We find that the connection density and degree distribution play a significant role in source locatability: sources in a homogeneous, denser network are more readily located, which distinguishes our results from existing algorithms for source localization in the literature [11,14,15]. A striking and counter-intuitive finding is that, for an undirected network with one connected component and random link weights, a single messenger node is sufficient to locate any number of sources in the presence of weak noise.

    Theoretically, the combination of the minimum output analysis (derived from the controllability and observability theories for complex networks) and the CS-based localization method constitutes a general framework for locating diffusion sources in complex networks. It represents a powerful paradigm for exactly quantifying the source locatability of a network and for actually locating the sources efficiently and accurately. Because of the CS-based methodology, our framework is robust against noise [23,24], paving the way to practical implementation in noisy environments.

    We consider a class of diffusive processes on networks, described by

    \[ x_i(t+1) = x_i(t) + \beta \sum_{j=1}^{N} \bigl[ w_{ij}\, x_j(t) - w_{ji}\, x_i(t) \bigr]. \tag{2.1} \]

    This equation is a good approximation for various types of linear diffusion processes and for the linearization of some nonlinear diffusion processes [25]. For example, epidemics can be treated as linear dynamics in the early stages if the network connectivity is high. The variable x_i(t), the state of node i at time t, captures the fraction of infected individuals, the concentration of a water or air pollutant, etc., at place i; β is the diffusion coefficient; w_ij (w_ji) is the weight of the directed link from node j to node i (from i to j), with w_ij = w_ji for undirected networks; and N is the number of nodes in the network (its size). Note that the value of the diffusion parameter β should be constrained to ensure the physical meaning of x_i(t), i.e. that x_i(t) remains in the range [0, 1] at all times for every node. We can prove that this confinement leads to \( \beta \in \bigl( 0,\ \min_{i=1,2,\ldots,N} \bigl( 1 / \sum_{j=1,\, j \neq i}^{N} w_{ji} \bigr) \bigr] \) (see electronic supplementary material, S1 for the proof). Equation (2.1) is discrete in time, which greatly facilitates computation and analysis. When observations are made from a subset of nodes, the messenger nodes, system (2.1) incorporating the outputs from these nodes can be written concisely as

    \[ \begin{cases} x(t+1) = (I + \beta L)\, x(t), \\ y(t) = C\, x(t), \end{cases} \tag{2.2} \]

    where \( x(t) \in \mathbb{R}^N \) is the state vector of the entire network at time t, \( I \in \mathbb{R}^{N \times N} \) is the identity matrix, L = W − D is a Laplacian matrix, \( W \in \mathbb{R}^{N \times N} \) is the weighted adjacency matrix with elements w_ij, and \( D \in \mathbb{R}^{N \times N} \) is a diagonal matrix whose elements \( d_i = \sum_{j \in \Gamma_i} w_{ji} \) are the total out-weights of the nodes, where Γ_i is the neighbour set of node i. The vector \( y(t) \in \mathbb{R}^q \) is the output at time t and \( C \in \mathbb{R}^{q \times N} \) is the output matrix. Messenger nodes are specified through the matrix C, and y(t) records the states of these nodes. The source localization problem, a kind of inverse problem for diffusion and spreading dynamics on complex networks, is illustrated in figure 1.
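    As a minimal sketch of these dynamics (our own illustrative code, assuming the weighted adjacency matrix W is given with W[i, j] = w_ij), the following iterates equation (2.2) and records the messenger outputs y(t):

```python
import numpy as np

def simulate_diffusion(W, x0, beta, steps, messengers):
    """Iterate x(t+1) = (I + beta*L) x(t), with L = W - D and D the diagonal
    matrix of total out-weights d_i = sum_j w_ji, recording the messenger
    outputs y(t) at each step."""
    N = W.shape[0]
    D = np.diag(W.sum(axis=0))        # column sums: d_i = sum_j W[j, i] = sum_j w_ji
    A = np.eye(N) + beta * (W - D)    # I + beta*L
    x = np.asarray(x0, dtype=float)
    Y = []
    for _ in range(steps):
        Y.append(x[messengers].copy())  # y(t): states of the messenger nodes
        x = A @ x
    return np.array(Y)
```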

    Figure 1. Illustration of the source localization problem. (a) A random network with two sources at the initial time t = 0. (b–d) The diffusion process at t = 1 (b), t = 2 (c) and t = 5 (d), respectively. The colour bar represents the node state x_i(t), and links along which diffusion has occurred are marked in red. Panels (a–d) describe a diffusion (spreading) process from two sources to the whole network according to equation (2.1). (e–g) Five messenger nodes whose states can be measured and collected at three time instants. The messenger nodes are specified by the output matrix C, and their states constitute y(t). The times of (e), (f) and (g) correspond to (b), (c) and (d), respectively. In a real situation, however, the observation times as well as the initial time are unknown; the only information available for locating sources is the states of a set of messenger nodes at some times, together with the network structure. Panels (e–g), traced back to (a), describe the source localization problem to be solved. Moreover, we aim to identify a minimum set of messenger nodes that can locate an arbitrary number of sources at any location, by virtue of our minimum output analysis and optimization based on compressive sensing.


    The basic difference between source nodes and the other nodes in the network is that, initially (t = t_0), the states of the former are non-zero while those of the latter are zero. To achieve accurate localization of an arbitrary number of sources at arbitrary locations, it suffices to recover the initial states of all nodes from the measurements of the messenger nodes at later times (t > t_0). A solution to this problem can be obtained using the observability condition in canonical control theory. To be specific, we consider instants of time t_0, t_1, …, t and iterate equation (2.2), which yields the relation between x(t) and x(t_0): \( x(t) = [I + \beta L]^{t - t_0} x(t_0) \). Consequently, the output, which depends on x(t_0), can be expressed as \( y(t) = C (I + \beta L)^{t - t_0} x(t_0) \). The key to accurate localization of sources lies in the existence of a unique solution of this equation, given the output vector y(t) from the set of messenger nodes specified by C. Intuitively, to obtain a unique solution, no fewer than N snapshots of measurement are needed. Without loss of generality, we assume that an uninterrupted time series from t_0 to t_0 + N − 1 is available. We obtain

    \[ Y = O \cdot x(t_0), \tag{2.3} \]

    where \( Y \in \mathbb{R}^{qN} \), the initial state vector is \( x(t_0) \in \mathbb{R}^{N} \), q is the number of messenger nodes, and the matrix \( O \in \mathbb{R}^{qN \times N} \) is precisely the observability matrix of canonical control theory (see §5.1 for details of equation (2.3)). The observability full-rank condition [26] stipulates that equation (2.3) has a unique solution, i.e. the state vector x(t_0) at the initial time t_0 is observable, if and only if rank(O) = N. Insofar as the given output matrix C satisfies the observability rank condition, the initial states of the nodes can be fully reconstructed from the states of the messenger nodes, and all sources can then be located. A challenge is that, in a realistic situation, the initial time t_0 is often unknown, rendering immediate application of the canonical observability condition invalid. However, a unique and desirable feature of our framework is that both x(t_0) and t_0 can be inferred based on CS (see §§3 and 5.2). Thus, it is possible to develop a theoretical framework on the basis of the observability condition (see electronic supplementary material, S2 for continuous-time processes).
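    A sketch of this test (ours, intended for small networks where the dense qN × N matrix fits in memory) stacks the blocks of equation (2.3) and checks the full-rank condition:

```python
import numpy as np

def observability_matrix(L, beta, C):
    """Stack C, C(I + beta*L), ..., C(I + beta*L)^(N-1), as in equation (2.3)."""
    N = L.shape[0]
    A = np.eye(N) + beta * L
    CA = np.asarray(C, dtype=float)
    blocks = []
    for _ in range(N):
        blocks.append(CA)
        CA = CA @ A
    return np.vstack(blocks)

def fully_observable(L, beta, C, tol=1e-9):
    """True if rank(O) = N, i.e. x(t0) is uniquely recoverable from the outputs."""
    O = observability_matrix(L, beta, C)
    return np.linalg.matrix_rank(O, tol=tol) == L.shape[0]
```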

    Beyond the canonical observability theory, our goal here is to identify a minimum set of messenger nodes satisfying the full-rank condition for observability. However, the brute-force method of enumerating all possible choices of the messenger nodes is computationally prohibitive [27], as the total number of possible configurations is 2^N. Our solution is to use the recently developed exact controllability framework [19], based on the standard Popov–Belevitch–Hautus (PBH) test [28], and to exploit the dual relationship between controllability and observability [20], which results in a practical framework for finding the required N_m messenger nodes. In particular, for an arbitrary network, according to the PBH test and the exact controllability framework, N_m is determined by the maximum geometric multiplicity of the eigenvalues λ_i of the matrix I + βL. After some matrix calculation, we obtain (see electronic supplementary material, S3)

    \[ N_m = \max_i \bigl\{ N - \operatorname{rank}\bigl[ \lambda_i^L I - L \bigr] \bigr\}, \tag{2.4} \]

    where \( \lambda_i^L \) is an eigenvalue of the matrix L and \( \mu(\lambda_i^L) \equiv N - \operatorname{rank}[\lambda_i^L I - L] \) is its geometric multiplicity. It is worth noting that the formula for N_m does not contain the diffusion parameter β, indicating that the choice of β does not affect the locatability measure n_m. Equation (2.4), a result of the standard PBH test, is a general minimum output analysis for arbitrary networks.

    For an undirected network, L is symmetric and the geometric multiplicity is simply the eigenvalue degeneracy. In addition, the eigenvalue degeneracy of L is equal to that of I + βL (see electronic supplementary material, S3). Thus, N_m is determined by the maximum eigenvalue degeneracy of L as

    \[ N_m^{\text{undirect}} = \max_i \bigl\{ \delta(\lambda_i^L) \bigr\}, \tag{2.5} \]

    where \( \delta(\lambda_i^L) \) is the degeneracy of \( \lambda_i^L \) (the number of appearances of \( \lambda_i^L \) in the eigenvalue spectrum). Equation (2.5), based on the PBH test, is our minimum output analysis for arbitrary undirected networks.
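    The following sketch (our illustrative implementation, with a numerical tolerance for grouping eigenvalues as an assumption) evaluates equation (2.4) directly; for a symmetric L it reduces to the maximum eigenvalue degeneracy of equation (2.5):

```python
import numpy as np

def minimum_messengers(L, tol=1e-8):
    """Equation (2.4): N_m = max_i { N - rank(lambda_i * I - L) }, i.e. the
    maximum geometric multiplicity over the distinct eigenvalues of L."""
    N = L.shape[0]
    distinct = []
    for lam in np.linalg.eigvals(L):
        if all(abs(lam - mu) >= tol for mu in distinct):
            distinct.append(lam)          # keep one representative per eigenvalue
    return max(N - np.linalg.matrix_rank(lam * np.eye(N) - L, tol=tol)
               for lam in distinct)
```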

    Equations (2.4) and (2.5) constitute the exact theory (ET) for the minimum output N_m, without any approximations, but the associated computational cost of calculating the eigenvalues and identifying the maximum through a large number of comparisons is generally high. Taking advantage of the ubiquitous sparsity of real networks [29], we can obtain an alternative method that estimates N_m with much higher efficiency. In particular, for sparse networks, we have (see electronic supplementary material, S4)

    \[ n_m^{\text{sparse}} \approx 1 - \frac{\operatorname{rank}(aI - L)}{N}, \tag{2.6} \]

    where, for undirected networks, a is either zero or the diagonal element with the maximum multiplicity (number of appearances on the diagonal) of the matrix L. The matrix rank, as well as the eigenvalues, in formula (2.6) can be computed using fast algorithms from computational linear algebra, such as SVD with computational complexity O(N^3) [30] or LU decomposition with computational complexity O(N^{2.376}) [31]. In general, equation (2.6) allows us to compute n_m efficiently, hence the term fast estimation (FE) method.
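    The corresponding fast estimate is a one-liner (again our sketch; the value of a must be supplied as zero or the most frequent diagonal element of L):

```python
import numpy as np

def nm_fast_estimate(L, a=0.0):
    """Equation (2.6): n_m ~= 1 - rank(a*I - L)/N for a sparse network."""
    N = L.shape[0]
    return 1.0 - np.linalg.matrix_rank(a * np.eye(N) - L) / N
```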

    We first apply our minimum output analysis to undirected Erdős–Rényi (ER) random [32] and scale-free (SF) [33] networks and derive analytical results. Figure 2 shows that, as the average degree ⟨k⟩ (\( \langle k \rangle \equiv (1/N) \sum_i k_i \), where k_i is the degree of node i) is increased, n_m decreases for undirected ER random networks with identical and with random link weights. For the random networks, the efficient formula (2.6) can be further simplified. In particular, for small values of ⟨k⟩, owing to isolated nodes and disconnected components, zero dominates the eigenvalue spectrum of the matrix L [34]: each disconnected component generates at least one zero eigenvalue of L. For large values of ⟨k⟩, we expect all eigenvalues to be distinct, without any dominant one; in this case, we can still choose zero as the eigenvalue associated with a in equation (2.6). Taken together, over a wide range of ⟨k⟩ values, the efficient formula (2.6) holds with a = 0. Alternatively, the value of n_m for ER networks can be estimated theoretically from the degree distribution, because of the dominance of the null eigenvalue (see electronic supplementary material, S4)

    \[ n_m^{\text{U-ER}} \approx \begin{cases} 1 - \langle k \rangle / 2, & \langle k \rangle \in [0, 1], \\[4pt] \dfrac{1}{\langle k \rangle} \left( f(\langle k \rangle) - \dfrac{f(\langle k \rangle)^2}{2} \right), & \langle k \rangle \in (1, \infty), \end{cases} \tag{2.7} \]

    where \( f(\langle k \rangle) = \sum_{k=1}^{\infty} \dfrac{k^{k-1}}{k!} \bigl( \langle k \rangle\, e^{-\langle k \rangle} \bigr)^{k} \).
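    Evaluating equation (2.7) numerically only requires truncating the series for f(⟨k⟩); the sketch below (ours, with the truncation length as an assumption) works in log-space to avoid overflow of k^{k−1}/k!:

```python
import math

def f_er(mean_k, terms=200):
    """Truncated series f(<k>) = sum_{k>=1} (k^(k-1)/k!) (<k> e^{-<k>})^k,
    evaluated in log-space for numerical stability (mean_k > 0 assumed)."""
    total = 0.0
    for k in range(1, terms + 1):
        log_term = ((k - 1) * math.log(k) - math.lgamma(k + 1)
                    + k * (math.log(mean_k) - mean_k))
        total += math.exp(log_term)
    return total

def nm_er_analytical(mean_k):
    """Equation (2.7) for undirected ER networks."""
    if mean_k <= 1.0:
        return 1.0 - mean_k / 2.0
    f = f_er(mean_k)
    return (f - f ** 2 / 2.0) / mean_k
```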

    Figure 2. Locatability measure n_m for ER and SF networks. (a–b) For undirected networks, the source locatability measure n_m as a function of the connecting probability ⟨k⟩/N for (a) unweighted and (b) weighted ER networks. (c–d) n_m as a function of the average degree ⟨k⟩ for (c) unweighted SF networks, and N_m as a function of ⟨k⟩ for (d) weighted SF networks. For undirected networks, the values of n_m are obtained from the exact theory (ET; equation (2.5)), fast estimation (FE; equation (2.6)) and analytical prediction (Analytical), for different network sizes. The analytical prediction for ER networks is based on equation (2.7); for SF networks in (c), the prediction is from the cavity method. (e–h) For directed networks, the source locatability measure n_m as a function of the connecting probability 2⟨k⟩/N for (e) unweighted and (f) weighted ER networks, and as a function of ⟨k⟩ for (g) unweighted and (h) weighted SF networks. For directed networks, the ET results come from equation (2.4), while the FE results for ER and SF networks are from equation (2.6). The analytical predictions for ER and SF networks are from equations (2.8) and (2.9), respectively. For weighted networks, link weights are randomly selected from a uniform distribution in the range (0, 2), so that the mean weight is approximately one. The ET and FE results are obtained by averaging over 50 independent realizations, and the error bars represent standard deviations. For undirected ER networks, ⟨k⟩ = N p_con, where p_con is the connecting probability between each pair of nodes; thus p_con = ⟨k⟩/N. For directed ER networks, ⟨k⟩ = N p_con/2, yielding p_con = 2⟨k⟩/N.


    For undirected SF networks, a in the efficient formula (2.6) is the diagonal element with the maximum number of appearances on the diagonal of the matrix L. In the controllability framework, the density of driver nodes can be calculated [34,35] with the cavity method [36]; the same principle can be extended to the locatability measure of SF networks (see electronic supplementary material, S5). The analytical estimates for both ER and SF networks are in good agreement with the results of ET and FE, as shown in figure 2a–d. Indeed, the results indicate that choosing a = 0 in the efficient formula (2.6) is justified for the ER networks. For small values of ⟨k⟩, zero dominates the eigenvalue spectrum, and a number of messenger nodes are required, with n_m > 1/N. When ⟨k⟩ exceeds a certain value, all eigenvalues become distinct, which accounts for the result of a single messenger node with n_m = 1/N; this relation continues to hold as ⟨k⟩ is increased further.

    We also find that random link weights have little effect on n_m for ER networks (compare figure 2a with figure 2b), owing to the fact that a sparse ER network tends to have many isolated components. By contrast, for SF networks, random link weights can induce a dramatic difference from the case of identical link weights, as seen by comparing figure 2c with figure 2d. In particular, a single messenger node is sufficient to locate sources under random link weights with weak noise, regardless of the values of ⟨k⟩ and N. This phenomenon can be explained using equation (2.5): random link weights can be regarded as a perturbation of the eigenvalues of the corresponding unweighted Laplacian matrix (the locations of the non-zero elements in the two matrices are the same). If the network has a single component, the unweighted Laplacian matrix has exactly one zero eigenvalue in its spectrum. Random link weights shift the non-zero eigenvalues, making the probability of finding two or more identical eigenvalues effectively zero. We then expect one null eigenvalue and N − 1 distinct non-zero eigenvalues, so that the entire spectrum consists of distinct eigenvalues. As a result, according to equation (2.5), N_m = 1 for an undirected, single-component SF network with random link weights. More generally, for an arbitrary undirected network with random link weights and multiple components, the value of N_m is determined exclusively by the number of components N_c, i.e. N_m = N_c, because each component contributes one null eigenvalue; consequently, the maximum eigenvalue degeneracy that determines N_m equals the number of components.
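    This prediction is easy to check numerically; the snippet below (an illustrative check of ours, assuming networkx is available, with the graph parameters chosen arbitrarily) compares the number of connected components with the maximum eigenvalue degeneracy of a randomly weighted Laplacian:

```python
import networkx as nx
import numpy as np

# For an undirected network with random link weights, N_m should equal the
# number of connected components: each component contributes one null
# eigenvalue, and random weights make all other eigenvalues distinct.
rng = np.random.default_rng(1)
G = nx.erdos_renyi_graph(60, 0.03, seed=1)
for u, v in G.edges:
    G[u][v]["weight"] = rng.uniform(0.0, 2.0)
L = -nx.laplacian_matrix(G, weight="weight").toarray().astype(float)  # L = W - D
eigvals = np.linalg.eigvalsh(L)
degeneracy = max(int(np.sum(np.abs(eigvals - lam) < 1e-8)) for lam in eigvals)
print(nx.number_connected_components(G), degeneracy)  # the two numbers agree
```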

    We now turn to directed ER and SF networks. For unidirectional links in such a network, the average degree of the network is ⟨k⟩ = ⟨k_out⟩/2 = ⟨k_in⟩/2, where k_out and k_in denote the out-degree and in-degree, respectively. For directed ER networks, the FE formula is equation (2.6) with a = 0. An analytical prediction of n_m can be obtained based on the FE (see electronic supplementary material, S4)

    \[ n_m^{\text{D-ER}} \approx e^{-\langle k \rangle} + \frac{\langle k \rangle^{2}\, e^{-2\langle k \rangle}}{4}. \tag{2.8} \]

    For directed SF networks, the FE formula is still equation (2.6) with a=0, −1 or −2 (see electronic supplementary material, S4). The quantity nm can be theoretically predicted via (see electronic supplementary material, S4)

    \[ n_m^{\text{D-SF}} \approx \sum_{k=1}^{N-1} 2^{-k} P(k), \tag{2.9} \]

    where k is the node degree and P(k) = P(k_in + k_out) is the degree distribution. Figure 2e–h shows that, for directed ER and SF networks, the results for n_m from FE and from the analytical predictions agree well with those from ET, which involves no approximations.

    It is noteworthy that, for directed networks with random link weights, N_m is not determined by the number of components N_c, because there can be more than one zero in the eigenvalue spectrum of a single component, a situation that differs from the undirected case. In particular, for a directed network, the matrix L can have any number of zero diagonal elements, because any node without outgoing links produces such a diagonal element. According to the minimum output analysis, a single component can then contain any number of messenger nodes. As a result, in contrast with undirected networks with random weights, the quantity N_m in directed networks with random link weights should be calculated using either equation (2.4) or, for sparse networks, equation (2.6), not by counting the number of disconnected components.

    We also investigate the source locatability n_m for a number of empirical social and technological networks on which diffusion or spreading processes may occur. Because link weights are not available for these real networks, we consider two typical scenarios: unweighted networks and random link weights. As shown in figure 3a, n_m for an unweighted real network is always larger than or equal to that of the same network with random weights, indicating that random link weights are beneficial to source localization. Another feature is that sources in technological networks with heterogeneous degree distributions (e.g. Wiki-vote, p2p-Gnutella, PGP, Political blogs, USAir) are usually more difficult to locate than those in social networks with relatively homogeneous degree distributions.


    Figure 3. Source locatability of empirical networks. (a) The locatability measure n_m as a function of the average degree ⟨k⟩ for a number of real social and technological networks on which diffusion and spreading processes may occur. (b) The locatability measure obtained from the exact theory, n_m(ET) (equation (2.4) or (2.5)), versus that obtained from the fast estimation, n_m(FE) (equation (2.6)), for the real networks. Here ⟨k⟩ = ⟨k_in⟩/2 = ⟨k_out⟩/2 for a directed network. Theoretical results for the ER network (equation (2.7)) and the SF network with γ = 3 (equation (2.9)) are shown as a reference. Hollow symbols represent the results for unweighted real networks, and solid symbols the results for real networks with random link weights selected from a uniform distribution in the range (0, 2). More details of the real networks can be found in electronic supplementary material, S6 and table S1.


    We also test the practical feasibility of our fast estimation approach on the real networks. As shown in figure 3b, there is good agreement between n_m(ET), based on the exact locatability theory with its high computational cost, and n_m(FE), from the much more efficient fast estimation, for both unweighted and randomly weighted real networks. These results validate our fast estimation approach as applied to real networks. (The characteristics of the real networks are described in electronic supplementary material, S6 and table S1.)

    Combining the results for real and model networks, we find that the average node degree, the degree distribution and the link-weight distribution jointly determine source locatability. In particular, sources in networks with a homogeneous degree distribution, more connections and random link weights are more readily located.

    We now demonstrate how the N_m messenger nodes can be identified using the theory of exact observability of complex networks [19]. In particular, according to the classic PBH test [28] and our locatability theory, the output matrix C associated with the N_m messenger nodes satisfies the rank condition \( \operatorname{rank}\begin{pmatrix} \lambda_{\max} I - L \\ C \end{pmatrix} = N \), where λ_max is the eigenvalue with the maximum geometric multiplicity μ(λ_max) of the matrix L, i.e. the eigenvalue for which N − rank(λ_max I − L) attains its maximum value, which is exactly N_m (see equation (2.4); electronic supplementary material, S3). The messenger nodes are identified as soon as the output matrix C is determined. The computational complexity of our elementary transformation is O(N^2 (log N)^2) [37]. Figure 4a–j illustrates, for an undirected and a directed network, how our method identifies the messengers. For each case, we first compute the eigenvalues λ_i^L of the matrix L and find the eigenvalue λ_max corresponding to μ(λ_max). We then apply elementary row transformations to λ_max I − L to obtain its row canonical form, which reveals a set of linearly dependent columns. The messenger nodes are exactly the nodes corresponding to the columns that are linearly dependent on other columns, and the minimum number of messenger nodes (linearly dependent columns) is exactly N_m. Note that alternative configurations of the messenger nodes are possible. For example, as shown in figure 4g, columns 1 and 2, and columns 4 and 5, are linearly correlated, requiring two messengers; there are thus four equivalent combinations of messenger nodes, (1, 4), (1, 5), (2, 4) and (2, 5), any of which can be chosen.
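    The sketch below (ours, intended only for small undirected examples such as figure 4a, since floating-point row reduction is numerically fragile) follows this recipe: find λ_max, row-reduce λ_max I − L, and return the non-pivot (linearly dependent) columns as one valid messenger set.

```python
import numpy as np
import sympy as sp

def messenger_nodes(L, tol=1e-8):
    """One valid messenger set for a small undirected network: take the
    eigenvalue lam_max of L with the largest degeneracy, row-reduce
    lam_max*I - L, and return the non-pivot (linearly dependent) columns;
    their number equals N_m."""
    N = L.shape[0]
    eigvals = np.linalg.eigvalsh(L)                    # L symmetric (undirected)
    mult = lambda lam: int(np.sum(np.abs(eigvals - lam) < tol))
    lam_max = max(eigvals, key=mult)                   # maximum-degeneracy eigenvalue
    M = sp.Matrix((lam_max * np.eye(N) - L).round(8))
    _, pivots = M.rref()                               # row canonical form
    return [i for i in range(N) if i not in pivots]    # messenger node indices
```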


    Figure 4. Identification of messengers. (a–b) Illustration of our method for identifying messenger nodes for (a) a simple undirected network and (b) a simple directed network. (c–d) Eigenvalues of the undirected network in (a) and of the directed network in (b), respectively; the eigenvalue λ_max corresponding to the maximum geometric multiplicity μ(λ_max) is highlighted in red. (e–f) The matrix λ_max I − L for the networks in (a) and (b), respectively, with λ_max highlighted. (g–h) Row canonical forms of the matrices in (e) and (f), obtained by elementary row transformations; linearly dependent columns are highlighted in blue. (i–j) Messenger nodes corresponding to the linearly dependent columns for the networks in (a) and (b), respectively, together with the output signals produced by the messenger nodes. For the networks in (a) and (b), the configuration of messengers is not unique, as it depends on the elementary row transformations, but the number of messengers N_m is fixed and solely determined by μ(λ_max).


    A result from canonical observability theory is that, to fully reconstruct x(t_0) from solutions of equation (2.3), at least N steps of measurement from the messenger nodes are necessary. For our localization problem, however, the sources are ‘minority’ nodes, in the sense that the number of sources is much smaller than the network size: the states of most nodes are initially zero, so the vector x(t_0) is sparse, with a large number of zero elements. The sparsity of x(t_0) can be exploited to greatly reduce the measurement requirement. In particular, in the CS framework for sparse signal reconstruction [22,38], equation (2.3) can be solved and accurate reconstruction of x(t_0) achieved through solutions of the following convex optimization problem:

    \[ \min \lVert x(t_0) \rVert_1 \quad \text{subject to} \quad Y = O \cdot x(t_0), \tag{3.1} \]

    where \( \lVert x(t_0) \rVert_1 = \sum_{i=1}^{N} |x_i(t_0)| \) is the L1 norm of x(t_0), \( Y \in \mathbb{R}^{qM} \), \( O \in \mathbb{R}^{qM \times N} \) and \( x(t_0) \in \mathbb{R}^{N} \).

    If O satisfies the restricted isometry property (RIP) [39], full reconstruction of x(t_0) from M-step measurements can be guaranteed theoretically via standard optimization methods, where M is much smaller than N. For realistic complex networks the RIP may be violated, but because of the linear independence of the rows of O it remains feasible to reconstruct x(t_0) from sparse data, with M still much smaller than N. Another advantage of the CS framework is its robustness against noise. In particular, a direct solution of x(t_0) cannot be obtained when there is measurement noise or when the measurements are insufficient (M < N), but the CS framework overcomes these difficulties.
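    For concreteness, the convex program (3.1) can be solved as a standard linear program (basis pursuit). The sketch below is one common reformulation (our illustrative choice; any CS solver would do), writing x = u − v with u, v ≥ 0:

```python
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(O, Y):
    """Basis pursuit: min ||x||_1 subject to O x = Y, via the standard LP
    reformulation x = u - v with u, v >= 0 and objective sum(u) + sum(v)."""
    _, N = O.shape
    c = np.ones(2 * N)                        # ||x||_1 = sum(u) + sum(v) at optimum
    A_eq = np.hstack([O, -O])                 # enforces O(u - v) = Y
    res = linprog(c, A_eq=A_eq, b_eq=Y,
                  bounds=[(0, None)] * (2 * N), method="highs")
    u, v = res.x[:N], res.x[N:]
    return u - v
```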

    A complete description of our framework for reconstructing the initial states with unknown t_0 is given in §5.2. Here we present an example of locating diffusion sources in an SF network, as shown in figure 5. For an SF network consisting of a single connected component with random link weights, our minimum output analysis gives N_m = 1, and the single messenger node can be selected arbitrarily. Figure 5a shows such an SF network with four sources and a single messenger node. For convenience, we define data ≡ M/N, i.e. the ratio of the amount of measurement used to the amount required by canonical observability theory. Figure 5b shows the form of Y = O x(t_0), in which the initial state vector x(t_0) is to be reconstructed. Note that x(t_0) is quite sparse, with four non-zero elements corresponding to the four sources; it can therefore be reconstructed by compressive sensing from a relatively small amount of data. Figure 5c shows that, for data = 0.5 and in the absence of noise, the four sources and their locations, as well as the initial (triggering) time t_0, can be accurately inferred even though t_0 is unknown. The reconstructed state x(t_ini − 3) is the sparsest, in the sense that it is sparser than all the other states before and after t_ini − 3. This indicates that the initial time is t_0 = t_ini − 3 and that x(t_ini − 3) is the initial state, in which the elements x_i(t_ini − 3) with non-zero values correspond to sources.


    Figure 5. An example of locating sources in an undirected weighted SF network. (a) An SF network with four sources, with colours representing the initial state values. The single messenger node is marked as a blue square. The thickness of the links represents their weights and the sizes of the nodes indicate their degrees. (b) The form of Y = O x(t_0) and the sparse initial state vector x(t_0) to be reconstructed using compressive sensing from a relatively small amount of data. (c) Reconstructed state x_i(t) of each node for t ≤ t_ini, where the initial observation time is t_ini (t_ini ≥ t_0). Colours represent the values of x_i(t) for t ≤ t_ini. (d) Reconstructed initial state x_i(t_0) of each node for different initial observation times t_ini at which t_0, the true triggering time, is successfully inferred. Colours represent the reconstructed values of x_i(t_0), with the same meaning as in (a). The four sources are randomly selected and their x_i(t_0) values are larger than zero. (e) Area under a receiver operating characteristic curve (AUROC) as a function of t (t ≤ t_ini) for a fixed initial observation time t_ini. (f) AUROC versus t for different initial observation times t_ini and different numbers of sources (N_s). Network parameters are as follows: network size N = 50, average degree ⟨k⟩ = 4, and random link weights selected from a uniform distribution in the range (0, 2). For the diffusion dynamics, the diffusion parameter is β = 0.05 and the initial states of the sources in x(t_0) are randomly selected from a uniform distribution in the range (0.1, 1). For the source localization process, the parameters are noise amplitude σ = 0 and data = 0.5, and the results are obtained by averaging over 300 independent simulations.


    An alternative criterion for inferring the initial time t_0 is that x(t_0) is non-negative while some elements of x(t_0 − 1) are negative. Negative values appear in x(t_0 − 1) because no physical diffusion process exists at time t_0 − 1: there is no physical solution for x(t_0 − 1), regardless of the method used to compute it, so a forced solution will contain unreasonable values. Negative values in x(t_0 − 1), x(t_0 − 2), … are therefore highly likely, offering an alternative to the sparsity of x(t) for inferring t_0.

    In this manner, not only can we locate the sources, but we can also infer their initial states. As shown in figure 5c, the reconstructed initial state values of the sources at t = t_0 are in good agreement with those shown in figure 5a (see §5.2 for more details). Figure 5d shows how different initial observation times t_ini affect source localization. We find that, over the wide range of t_ini from t_0 − 10 to t_0 + 80, the four sources can be precisely located from a small amount of data. Here t_ini < t_0 means that we started observing the messenger nodes before the diffusion event was triggered by the four sources, which is possible because t_0 is unknown. If t_ini is much earlier than t_0, the spreading process may not have begun even after the M-step measurement, rendering source localization impossible by any method in principle; this accounts for the failure of our method for t_ini < t_0 − 20. Likewise, if t_ini is much later than t_0, computational errors and noise effects are amplified in the CS-based optimization, reducing the accuracy of source localization, e.g. for t_ini > t_0 + 90. These issues notwithstanding, our method is effective over a vast range of t_ini for multiple sources, based on sparse data from a minimum number of messenger nodes.

    To characterize the performance of our source localization method, we use a standard index from signal processing, the area under a receiver operating characteristic curve (AUROC) [40,41]. In particular, AUROC = 1 indicates the existence of a threshold that entirely separates the initial states x(t_0) of the sources from those of the other nodes, giving rise to perfect localization of the sources (see electronic supplementary material, S7 for the detailed definition of AUROC). As a concrete example, we set t_ini = t_0 + 10. Figure 5e shows that the value of AUROC reaches unity at t_ini − 10, namely t_0, demonstrating nearly perfect localization for different numbers of sources. The highest reconstruction accuracy, at t = t_0, corresponds to the highest sparsity of the reconstructed state at t_0 in figure 5c. For t > t_0, at an arbitrary time t′, the number of nodes with non-zero states is larger than the number of sources, because of diffusion from the sources to the other nodes; one therefore cannot fully distinguish sources from the other nodes based on the reconstructed x(t′), which accounts for the lower values of AUROC at t′ compared with t_0. On the other hand, consider an arbitrary time t″ < t_0. At t″, the spreading process has not yet occurred, and there is no causality between the states at t″ and the observations. When we impose the reconstruction on x(t″), we obtain not the true x(t″), with all elements zero, but a virtual initial state vector containing errors relative to x(t_0). These reconstruction errors produce additional non-zero states on top of x(t_0), inducing a denser state vector than x(t_0) and therefore lower values of AUROC; they also explain why the AUROC value decreases more rapidly for t < t_0 than for t > t_0. Figure 5f shows the statistical results corresponding to figure 5d. We see that AUROC reaches unity when the observation time t_ini is about 3 time steps ahead of t_0, and the AUROC value is nearly unchanged as t_ini is increased further, consistent with the phenomena shown in figure 5d. (In addition, examples of locating sources in ER networks with and without measurement noise, and in SF networks with measurement noise, are presented in electronic supplementary material, S8 and figures S1–S3.) Here we choose node no. 50 to be the messenger; we find that different choices of messenger do not affect the localization results (see electronic supplementary material, S8 and figure S4). We also investigate the effect of network size on source localization, and find that less data is needed to reach AUROC = 1 in larger networks (see electronic supplementary material, S8 and figure S5). This is because the initial state x(t_0) is relatively sparser in a larger network, so that, for a given AUROC, the CS framework requires a smaller amount of data.
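    In practice, the AUROC for a reconstructed initial state can be computed with any standard implementation; the helper below (ours, assuming scikit-learn is available) treats the sources as the positive class and the magnitudes of the reconstructed initial states as scores:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def localization_auroc(true_sources, x0_reconstructed):
    """AUROC with the sources as the positive class and the magnitudes of the
    reconstructed initial states as scores; AUROC = 1 means a threshold
    perfectly separates sources from non-sources."""
    y_true = np.zeros(len(x0_reconstructed), dtype=int)
    y_true[list(true_sources)] = 1
    return roc_auc_score(y_true, np.abs(x0_reconstructed))
```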

    We also systematically test the performance of our locatability framework with respect to data requirements and robustness against noise. We assume that the measurements are contaminated by white Gaussian noise: \( \hat{y}(t) = y(t)[I + \mathcal{N}(0, \sigma^2 I)] \), where \( 0 \in \mathbb{R}^N \) is the zero vector, \( I \in \mathbb{R}^{N \times N} \) is the identity matrix and σ is the standard deviation. The results for AUROC as a function of data for ER and SF networks are shown in figure 6a,b. In the absence of noise (σ = 0), high AUROC values (e.g. 0.9) can be achieved even for data = 0.1, especially for SF networks; the AUROC value exceeds 0.95 when data = 0.3, and reaches unity for data ≥ 0.5. This essential feature holds in the presence of noise and for arbitrary values of N_s (see electronic supplementary material, S9 and figure S6). Another finding is that fewer sources (smaller values of N_s) require less data, because a smaller N_s induces a sparser x(t_0) and, in general, the CS framework requires less data to reconstruct a sparser vector. Systematic results on noise resistance are shown in figure 6c,d, where the AUROC value is nearly indistinguishable across different numbers of sources N_s; this differs from the results in figure 6a,b, and there is almost no difference between the results for ER and SF networks. Figure 6c,d also shows that, as σ is increased from 0 to 1, the AUROC value is only slightly reduced (AUROC ≈ 0.85 for σ = 1), indicating the extraordinary robustness of our locatability framework against noise. We also study the effect of the diffusion parameter β on source localization with respect to different data amounts and noise variances, and find that β has little influence on the accuracy of source localization (see electronic supplementary material, S10 and figures S7–S9).


    Figure 6. Locatability performance in undirected ER and SF networks. (a–d) AUROC as a function of data for (a) weighted ER and (b) unweighted SF networks, and as a function of noise variance σ for (c) weighted ER and (d) unweighted SF networks. In (a) and (b), σ is fixed at 0. In (c) and (d), data are fixed at 0.5. Cases with different numbers of sources, Ns, are included. For a random guess, the AUROC value is 0.5. The average degree 〈k〉 is 2 and 4 for the ER and SF networks, respectively. We set β=0.1 for ER networks and β=0.05 for SF networks. The results are obtained by averaging over 500 independent simulations. The other parameters are the same as in figure 5.


    We have developed a framework for locating the sources of diffusion or spreading dynamics in arbitrary complex networks (directed or undirected, weighted or unweighted) based solely on sparse measurements from a minimum number of messenger nodes. The key to the general framework lies in combining the controllability theory of complex networks with the compressive sensing paradigm for sparse signal reconstruction, both active areas of research in network science and engineering. In particular, the minimum set of messenger nodes can be identified efficiently using the minimum output analysis, based on the exact controllability of complex networks and the dual relation between controllability and observability. The ratio of the minimum number of messenger nodes to the network size characterizes the source locatability of a complex network. We find that sources in a denser and more homogeneous network are more readily located, which distinguishes our work from approaches in the literature based on alternative algorithms. A notable finding is that, for undirected networks with one component, random link weights and weak noise, a single messenger node is sufficient to locate sources at any locations in the network. Using the data from the minimum set of messenger nodes, a compressive-sensing-based approach precisely infers the initial time at which the diffusion process starts, together with the sources, i.e. the nodes with non-zero initial states. Because the initial state vector to be recovered for source localization is generically sparse, compressive sensing can locate the sources from small amounts of measurement, making our framework robust against insufficient data and noise. Practically, the highlights of our framework are three features: minimum messenger nodes, sparse data requirements and strong noise resistance, which together allow the sources of dynamical processes to be identified accurately and efficiently.

Our approach was partially inspired by pioneering efforts connecting the conventional observability theory for canonical linear dynamical systems with compressive sensing [42–44]. To our knowledge, the source locatability problem had not been tackled in such a comprehensive way prior to our work. The minimum output analysis, based on the controllability and observability theory for complex networks, deepens our understanding of dynamical processes on complex networks and finds applications, for example, in the design and analysis of large-scale sensor networks. Incorporating compressive sensing to uncover the sources and the initial time of diffusion represents an innovative approach to a practical problem of significant interest that is constrained by finite resources for data collection and by measurement or background noise. The underlying principle of the framework can potentially be applied to other optimization problems in complex networks. While we study diffusion models on time-invariant complex networks, our general framework provides significant insights into the open problem of developing source localization methods for time-variant complex networks hosting nonlinear diffusion processes.

    The detailed form of Y=O⋅x(t0) is

$$
\begin{pmatrix} y(t_0) \\ y(t_0+1) \\ \vdots \\ y(t_0+N-1) \end{pmatrix}
=
\begin{pmatrix} C \\ C[I+\beta L] \\ \vdots \\ C[I+\beta L]^{N-1} \end{pmatrix} x(t_0), \tag{5.1}
$$

where N time steps of measurements are needed to guarantee that the observability matrix O can attain full column rank. Insofar as O has full column rank, the canonical observability theory ensures a unique solution for the initial state of the localization equation Y = O⋅x(t0).
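To make equation (5.1) concrete, here is a minimal numpy sketch, under the linear diffusion model x(t+1) = [I + βL]x(t) implied by the equation, that stacks the block rows of O and recovers x(t0) when O has full column rank. The function names are illustrative, not the authors' implementation.

```python
import numpy as np

def observability_matrix(L, C, beta, steps):
    """Stack C, C A, ..., C A^(steps-1) with A = I + beta * L, as in eq. (5.1)."""
    N = L.shape[0]
    A = np.eye(N) + beta * L
    blocks, CA = [], C.copy()
    for _ in range(steps):
        blocks.append(CA)
        CA = CA @ A
    return np.vstack(blocks)

def recover_initial_state(L, C, beta, Y):
    """Solve Y = O x(t0); the solution is unique when O has full column rank."""
    steps = Y.shape[0] // C.shape[0]   # number of measurement time steps
    O = observability_matrix(L, C, beta, steps)
    x0, *_ = np.linalg.lstsq(O, Y, rcond=None)
    return x0
```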

    For realistic diffusive processes on networks, the initial time t0 is usually not known a priori, making inference of the initial state x(t0) a challenging task. Taking advantage of the sparsity of the initial vector x(t0) and the underlying principle of compressive sensing, we articulate an effective method to uncover both x(t0) and t0 from limited measurements.

Suppose the initial observation time is tini (tini ≥ t0). To account for all possible values of t0 prior to tini, we need to reconstruct a series of states, x(tini), x(tini−1), …, x(t0′), with t0′ chosen so that the actual t0 lies between t0′ and tini. This series of states can be reconstructed from the uninterrupted observations y(tini), …, y(tini+N−1) according to the following equations:

$$
\begin{aligned}
\begin{pmatrix} y(t_{\mathrm{ini}}) \\ y(t_{\mathrm{ini}}+1) \\ \vdots \\ y(t_{\mathrm{ini}}+N-1) \end{pmatrix}
&= \begin{pmatrix} C \\ C[I+\beta L] \\ \vdots \\ C[I+\beta L]^{N-1} \end{pmatrix} x(t_{\mathrm{ini}}),\\
\begin{pmatrix} y(t_{\mathrm{ini}}) \\ y(t_{\mathrm{ini}}+1) \\ \vdots \\ y(t_{\mathrm{ini}}+N-1) \end{pmatrix}
&= \begin{pmatrix} C[I+\beta L] \\ C[I+\beta L]^{2} \\ \vdots \\ C[I+\beta L]^{N} \end{pmatrix} x(t_{\mathrm{ini}}-1),\\
&\;\;\vdots \\
\begin{pmatrix} y(t_{\mathrm{ini}}) \\ y(t_{\mathrm{ini}}+1) \\ \vdots \\ y(t_{\mathrm{ini}}+N-1) \end{pmatrix}
&= \begin{pmatrix} C[I+\beta L]^{\,t_{\mathrm{ini}}-t_0'} \\ C[I+\beta L]^{\,t_{\mathrm{ini}}-t_0'+1} \\ \vdots \\ C[I+\beta L]^{\,t_{\mathrm{ini}}-t_0'+N-1} \end{pmatrix} x(t_0').
\end{aligned} \tag{5.2}
$$

The reconstruction process terminates, and t0 can be inferred, once a sparsest state has been identified, say x(t1), i.e. x(t1) is sparser than all reconstructed states at times before and after t1. Then x(t1) is taken to be the initial state and the initial time is set to t0 = t1.
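The scanning procedure just described can be sketched in a few lines of Python. The sketch below reconstructs each candidate initial state by the standard l1 relaxation used in compressive sensing (here via cvxpy) and returns the sparsest one; it uses only M block rows of measurements with M < N, anticipating the point made in the next paragraph. The names, the sparsity threshold and the choice of cvxpy are our assumptions, not the authors' implementation.

```python
import numpy as np
import cvxpy as cp

def l1_reconstruct(Phi, Y):
    """Standard CS relaxation: minimize ||x||_1 subject to Phi x = Y."""
    x = cp.Variable(Phi.shape[1])
    cp.Problem(cp.Minimize(cp.norm1(x)), [Phi @ x == Y]).solve()
    return x.value

def infer_initial_time(L, C, beta, Y, t_ini, t0_prime, M, tol=1e-6):
    """Scan candidate start times t_ini, t_ini - 1, ..., t0_prime as in
    equation (5.2) and return the sparsest reconstructed state."""
    N = L.shape[0]
    A = np.eye(N) + beta * L
    q = C.shape[0]
    best = None
    for t0 in range(t_ini, t0_prime - 1, -1):
        # Block rows start at power (t_ini - t0): C A^(t_ini - t0), ...
        CA = C @ np.linalg.matrix_power(A, t_ini - t0)
        blocks = []
        for _ in range(M):          # only M < N measurement steps are used
            blocks.append(CA)
            CA = CA @ A
        x = l1_reconstruct(np.vstack(blocks), Y[: q * M])
        nnz = int(np.sum(np.abs(x) > tol))
        if best is None or nnz < best[0]:
            best = (nnz, t0, x)
    return best[1], best[2]          # inferred t0 and initial state x(t0)
```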

By exploiting the natural sparsity of x(t), the CS framework for sparse signal reconstruction allows us to recover x(tini), x(tini−1), …, x(t0′) iteratively from a small amount of data, namely M steps of measurements with M < N, so that Y ∈ R^{qM}, O ∈ R^{qM×N} and x(t0′) ∈ R^N. Here M depends on the sparsity of the state vector: insofar as the number of sources Ns is much smaller than the network size N, M can be much smaller than N. By contrast, at least N steps of measurements are required under the conventional observability theory (equation (5.2)). According to equations (3.1) and (5.2), x(tini), x(tini−1), …, x(t0′) can therefore be reconstructed efficiently from an amount of observation much smaller than that required by the conventional observability theory.

    Data can be accessed at http://sss.bnu.edu.cn/%7Ewenxuw/data%5fset.htm.

    W.-X.W. and Y.-C.L. devised the research project. Z.-L.H. and X.H. performed numerical simulations. W.-X.W., Y.-C.L., Z.-L.H. and X.H. analysed the results. W.-X.W. and Y.-C.L. wrote the paper. All authors gave final approval for publication.

    We declare we have no competing interests.

W.-X.W. was supported by NSFC under Grant Nos. 71631002 and 61074116, the Fundamental Research Funds for the Central Universities and the Beijing Nova Programme. Y.-C.L. was supported by ARO under Grant No. W911NF-14-1-0504. Y.-C.L. would also like to acknowledge support from the Vannevar Bush Faculty Fellowship programme sponsored by the Basic Research Office of the Assistant Secretary of Defense for Research and Engineering and funded by the Office of Naval Research through Grant No. N00014-16-1-2828.

    We thank Mr Zhesi Shen for valuable discussion and comments. We thank the two reviewers and the associate editor for their constructive and valuable comments and suggestions, which greatly helped us improve our work. In particular, we are very grateful for the suggestion by the associate editor about the alternative method for inferring initial time t0.

    Footnotes

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

1. Vespignani A. 2012 Modelling dynamical processes in complex socio-technical systems. Nat. Phys. 8, 32–39. (doi:10.1038/NPHYS2160)
2. Neumann G, Noda T, Kawaoka Y. 2009 Emergence and pandemic potential of swine-origin H1N1 influenza virus. Nature 459, 931–939. (doi:10.1038/nature08157)
3. Chin R. 2013 Despite large research effort, H7N9 continues to baffle. Science 340, 414–415. (doi:10.1126/science.340.6131.414)
4. Lloyd AL, May RM. 2001 How viruses spread among computers and people. Science 292, 1316–1317. (doi:10.1126/science.1061076)
5. Wang P, González MC, Hidalgo CA, Barabási AL. 2009 Understanding the spreading patterns of mobile phone viruses. Science 324, 1071–1076. (doi:10.1126/science.1167053)
6. Centola D. 2010 The spread of behavior in an online social network experiment. Science 329, 1194–1197. (doi:10.1126/science.1185231)
7. Pope CA et al. 2002 Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA 287, 1132–1141. (doi:10.1001/jama.287.9.1132)
8. Shao M, Tang X, Zhang Y, Li M. 2006 City clusters in China: air and surface water pollution. Front. Ecol. Environ. 4, 353–361. (doi:10.1890/1540-9295(2006)004[0353:CCICAA]2.0.CO;2)
9. Broder A, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R, Tomkins A, Wiener J. 2000 Graph structure in the web. Comput. Netw. 33, 309–320. (doi:10.1016/S1389-1286(00)00083-9)
10. Gire SK et al. 2014 Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science 345, 1369–1372. (doi:10.1126/science.1259657)
11. Pinto PC, Thiran P, Vetterli M. 2012 Locating the source of diffusion in large-scale networks. Phys. Rev. Lett. 109, 068702. (doi:10.1103/PhysRevLett.109.068702)
12. Altarelli F, Braunstein A, Dall'Asta L, Lage-Castellanos A, Zecchina R. 2014 Bayesian inference of epidemics on networks via belief propagation. Phys. Rev. Lett. 112, 118701. (doi:10.1103/PhysRevLett.112.118701)
13. Brockmann D, Helbing D. 2013 The hidden geometry of complex, network-driven contagion phenomena. Science 342, 1337–1342. (doi:10.1126/science.1245200)
14. Zhu K, Ying L. 2016 Information source detection in the SIR model: a sample-path-based approach. IEEE/ACM Trans. Netw. 24, 408–421. (doi:10.1109/TNET.2014.2364972)
15. Shen Z, Chao S, Wang WX, Di Z, Stanley HE. 2016 Locating the source of diffusion in complex networks by time-reversal backward spreading. Phys. Rev. E 93, 032301. (doi:10.1103/PhysRevE.93.032301)
16. Kitsak M, Gallos LK, Havlin S, Liljeros F, Muchnik L, Stanley HE, Makse HA. 2010 Identification of influential spreaders in complex networks. Nat. Phys. 6, 888–893. (doi:10.1038/nphys1746)
17. Pei S, Muchnik L, Andrade JS, Zheng Z, Makse HA. 2014 Searching for superspreaders of information in real-world social media. Sci. Rep. 4, 5547. (doi:10.1038/srep05547)
18. Morone F, Makse HA. 2015 Influence maximization in complex networks through optimal percolation. Nature 524, 65–68. (doi:10.1038/nature14604)
19. Yuan Z, Zhao C, Di Z, Wang WX, Lai YC. 2013 Exact controllability of complex networks. Nat. Commun. 4, 2447. (doi:10.1038/ncomms3447)
20. Kalman RE. 1959 On the general theory of control systems. IRE Trans. Automat. Contr. 4, 110. (doi:10.1109/TAC.1959.1104873)
21. Candès EJ, Romberg J, Tao T. 2006 Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52, 489–509. (doi:10.1109/TIT.2005.862083)
22. Donoho DL. 2006 Compressed sensing. IEEE Trans. Inf. Theory 52, 1289–1306. (doi:10.1109/TIT.2006.871582)
23. Candès EJ, Wakin MB. 2008 An introduction to compressive sampling. IEEE Sig. Proc. Mag. 25, 21–30. (doi:10.1109/MSP.2007.914731)
24. Wang WX, Yang R, Lai YC, Kovanis V, Grebogi C. 2011 Predicting catastrophes in nonlinear dynamical systems by compressive sensing. Phys. Rev. Lett. 106, 154101. (doi:10.1103/PhysRevLett.106.154101)
25. Gomez S, Diaz-Guilera A, Gomez-Gardenes J, Perez-Vicente CJ, Moreno Y, Arenas A. 2013 Diffusion dynamics on multiplex networks. Phys. Rev. Lett. 110, 028701. (doi:10.1103/PhysRevLett.110.028701)
26. Kalman RE. 1963 Mathematical description of linear dynamical systems. J. Soc. Indus. Appl. Math. Ser. A 1, 152–192. (doi:10.1137/0301010)
27. Liu YY, Slotine JJ, Barabási AL. 2013 Observability of complex systems. Proc. Natl Acad. Sci. USA 110, 2460–2465. (doi:10.1073/pnas.1215508110)
28. Hautus M. 1969 Controllability and observability conditions of linear autonomous systems. Ned. Akad. Wetenschappen Proc. Ser. A 72, 443. (doi:10.1016/S1385-7258(70)80049-X)
29. Strogatz SH. 2001 Exploring complex networks. Nature 410, 268–276. (doi:10.1038/35065725)
30. Golub GH, Van Loan CF. 2012 Matrix computations, 4th edn. Baltimore, MD: JHU Press.
31. Cormen TH et al. 2001 Introduction to algorithms, 2nd edn. Cambridge, MA: MIT Press.
32. Erdös P, Rényi A. 1960 On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci. 5, 17–61.
33. Barabási AL, Albert R. 1999 Emergence of scaling in random networks. Science 286, 509–512. (doi:10.1126/science.286.5439.509)
34. Zhao C, Wang WX, Liu YY, Slotine JJ. 2015 Intrinsic dynamics induce global symmetry in network controllability. Sci. Rep. 5, 8422. (doi:10.1038/srep08422)
35. Liu YY, Slotine JJ, Barabási AL. 2011 Controllability of complex networks. Nature 473, 167–173. (doi:10.1038/nature10011)
36. Mézard M, Parisi G. 2001 The Bethe lattice spin glass revisited. Eur. Phys. J. B 20, 217–233. (doi:10.1007/PL00011099)
37. Grcar JF. 2011 How ordinary elimination became Gaussian elimination. Hist. Math. 38, 163–218. (doi:10.1016/j.hm.2010.06.003)
38. Han X, Shen Z, Wang WX, Di Z. 2015 Robust reconstruction of complex networks from sparse data. Phys. Rev. Lett. 114, 028701. (doi:10.1103/PhysRevLett.114.028701)
39. Candès EJ, Tao T. 2005 Decoding by linear programming. IEEE Trans. Inf. Theory 51, 4203–4215. (doi:10.1109/TIT.2005.858979)
40. Fawcett T. 2006 An introduction to ROC analysis. Pattern Recogn. Lett. 27, 861–874. (doi:10.1016/j.patrec.2005.10.010)
41. Hanley JA, McNeil BJ. 1982 The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36. (doi:10.1148/radiology.143.1.7063747)
42. Tarfulea N. 2011 Observability for initial value problems with sparse initial data. Appl. Comput. Harmon. Anal. 30, 423–427. (doi:10.1016/j.acha.2011.01.006)
43. Dai W, Yuksel S. 2013 Observability of a linear system under sparsity constraints. IEEE Trans. Automat. Contr. 58, 2372–2376. (doi:10.1109/TAC.2013.2253272)
44. Sanandaji BM, Wakin MB, Vincent TL. 2014 Observability with random observations. IEEE Trans. Automat. Contr. 59, 3002–3007. (doi:10.1109/TAC.2014.2351693)


    Page 12

    Upright faces are thought to be processed holistically, whereby local facial features are integrated into a unified representation for the purposes of efficient analysis [1–4]. Evidence for this view comes from the composite-face effect [5–7]. When upper and lower regions from different faces are aligned to form a facial composite, the two halves appear to ‘fuse’ together, perceptually. The illusory distortion induced by task-irrelevant (‘distractor’) halves hinders participants' judgements about task-relevant (‘target’) halves (for reviews, see [8,9]). However, when composite arrangements are misaligned spatially, or turned upside-down, the illusion-induced interference is greatly diminished [10,11]. The composite-face effect reveals a tendency to integrate feature information from disparate regions of intact upright faces, consistent with theories of holistic face processing [1–4].

Composite fusion is thought to distort the perception of face structure—a semi-permanent, durable source of facial variation that changes slowly over time [12,13]—leading to biased attributions of facial identity [7], age [14], gender [15] and attractiveness [16]. However, it is well established that manifest expressions—a transient source of facial variation [12,13]—also induce strong composite illusions that interfere with observers' attribution of facial emotion [5,17,18]. For example, observers are error-prone and slow when asked to name the emotion of a target half aligned with a distractor half exhibiting a different emotion, even when the two halves are from the same identity [5].

In recent years, the study of the composite-face effect has been dominated by matching paradigms, whereby observers are asked to judge whether the target regions in two composite arrangements—presented simultaneously or sequentially—are identical or not (e.g. [6,11,19–22]). These procedures are popular because they can be employed with unfamiliar faces (i.e. matching procedures do not necessitate a familiarization phase) and because they allow authors to compare the composite effects seen with faces and other classes of non-face object. Matching paradigms effectively demonstrate the presence of illusory distortion; however, they reveal little about the nature of the distortion induced, because the type and direction of the illusory bias remain ambiguous. The composite-face arrangements employed are constructed from emotionally neutral faces, in which the actor depicted has been instructed to convey no emotion. Where observed, composite effects derived from these paradigms are therefore assumed to reflect the binding of facial structure [8,9].

    Crucially, however, observers frequently perceive emotion in ostensibly neutral faces, contrary to the intention of the actors themselves and experimenters [23]. Capturing valence-free facial expressions is deceptively difficult; when posing for photos, actors seeking to appear ‘neutral’ often appear anxious, bored, threatening or cheerful. In addition, it is not always easy to distinguish a stranger's permanent facial shape from their transient facial expressions [24–26]. For example, it can be difficult to determine whether an unfamiliar actor is sad or simply has a mouth that droops at its corners, whether flared nostrils are a stable facial feature or a display of frustration. Similar effects can also be induced experimentally by feature displacement. For example, simply increasing the vertical distance between the eyes and mouth can augment perceptions of sadness, while decreasing this distance makes the same face appear angry [27].

This study sought to determine how perceived emotion cues influence the composite-face effect. Previous authors have noted that different sets of composite faces produce effect sizes that vary considerably [28,29]. To date, however, little is known about the origin of this inter-stimulus variability. Because image-matching composite paradigms simply require observers to judge whether target halves are identical or not, interference may be induced by the binding of perceived emotion, facial structure or both. Given the strength of the composite effects induced by facial emotion [5,17,18], some of the illusory distortion currently attributed to the binding of facial structure may in fact be induced by unintended emotion cues [8,9]. Consistent with this possibility, we describe three complementary experiments which suggest that subtle emotions perceived by observers exert a striking influence on the strength of the composite effect.

    It is well established that emotional distractors impair explicit emotion judgements made about the target region (e.g. labelling or categorization), when arrangements are aligned and upright [5,17,18]. It is unclear, however, whether emotion cues present in the distractor induce ‘incidental’ composite effects; i.e. illusory distortions that affect image matching, in the absence of an explicit emotion judgement. This was the possibility we sought to test in our first experiment. Neutral target regions were presented with task-irrelevant distractor regions, either aligned or misaligned, displaying: (i) no emotion, (ii) weak emotion or (iii) strong emotion. Should distractor regions induce similar levels of illusory interference irrespective of emotion content, it would indicate that the effects obtained using image-matching paradigms reflect the binding of facial structure only. However, modulation of illusory interference by the presence of emotion would imply that emotion cues also induce incidental composite interference.

Thirty-six naive adults completed the experiment (mean age = 20.58 years; s.d. = 3.17; eight males). Two participants were replaced, having scored 0% correct in one or more of the misaligned conditions. All participants had normal or corrected-to-normal vision. Ethical clearance was granted by the local ethics committee and the study was conducted in line with the ethical guidelines laid down in the 6th (2008) Declaration of Helsinki. All participants gave informed consent.

    In all the experiments described, upper face halves were used as target regions, and lower face halves were used as distractors (in line with the prevailing convention [9]). Target regions were taken from 18 neutral faces selected from the Radboud face database [30]. Distractor regions were selected from six different individuals sourced from the same database. Faces were cropped just above the nostrils. Three levels of emotion intensity were produced for each distractor identity, yielding 18 distractors in total. Three of the distractor identities (the happy subset) expressed no emotion, 50% happy and 100% happy. The remaining three distractor identities (the angry subset) expressed no emotion, 50% angry and 100% angry. The 50% intensities were created through image morphing completed using Morpheus Photo Morpher v. 3.11 (Morpheus Software, Indianapolis, IN). Facial composites subtended approximately 6° vertically when viewed at 58 cm. In the misaligned condition, target and distractor halves were offset horizontally by approximately 3°. A thin grey line (approx. 4 pixels) was inserted in between the target and distractor to help participants distinguish the to-be-judged regions. The absence of such delineation may artificially inflate the magnitude of composite-face effects [31].

Each trial began with a fixation point, after which two composite arrangements were presented sequentially, each for 200 ms (figure 1a). During an inter-stimulus interval of 1000 ms, a mask was presented, constructed from high-contrast greyscale ovals. The target halves could either be identical (50% of trials) or could differ (50% of trials). Participants made simple image-matching judgements about the targets. An original matching design was employed whereby the two distractor halves always differed [6,8]. One distractor was taken from the happy set and one from the angry set (note that this meant the identity of the distractor always differed). The allocation of happy and angry distractors to the first and second arrangements was counterbalanced. In the no emotion condition, distractor halves had 0% emotion; in the weak emotion condition, distractor halves had 50% emotion; in the strong emotion condition, distractor halves had 100% emotion. Thus, within each trial, the intensity of the expression was held constant, but the actual emotion presented in the two arrangements differed. In total, there were 216 experimental trials: 18 randomly selected target pairings × 2 target types (same, different) × 3 levels of perceived emotion (low, medium, high) × 2 alignments (aligned, misaligned). The different types of trial were randomly interleaved within four blocks of 54 trials. The experiment was programmed in Matlab with Psychtoolbox extensions [32,33].
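The factorial structure of the trial list is easy to see in code. Below is a minimal Python sketch (our labels, not the authors' Matlab/Psychtoolbox code) of how the 216 trials could be assembled and split into four blocks:

```python
from itertools import product
import random

target_pairings = range(18)                      # 18 target pairings
target_types = ("same", "different")
emotion_levels = ("0%", "50%", "100%")           # no, weak, strong emotion
alignments = ("aligned", "misaligned")

trials = [dict(pairing=p, target=t, emotion=e, alignment=a)
          for p, t, e, a in product(target_pairings, target_types,
                                    emotion_levels, alignments)]
assert len(trials) == 216                        # 18 x 2 x 3 x 2

random.shuffle(trials)                           # trial types interleaved
blocks = [trials[i:i + 54] for i in range(0, 216, 54)]  # four blocks of 54
```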

    Figure 1. (a) Sequentially presented composite faces were presented in which the distractor half either had 0% emotion, 50% emotion or 100% emotion. (b) Results from Experiment 1 in the low, moderate and high-emotion conditions. *** denotes p < 0.001, ** denotes p < 0.01, n.s., non-significant. Error bars denote ±1 s.e.m.

In the original matching design employed in Experiment 1, distractor halves always differ. The composite illusion is therefore revealed by a disproportionate accuracy cost on the ‘same’ target trials in the presence of aligned distractor regions. Crucially, the interaction between target type (same, different) and alignment (aligned, misaligned) was found to vary as a function of emotion (0%, 50%, 100%) (F(2,70) = 5.50, p = 0.006, η_p^2 = 0.14), suggesting that the emotion cues presented in the distractor halves influenced the strength of the composite illusion (figure 1b). Evidence of the composite illusion, inferred from simple target type × alignment interactions, was found in both the strong (F(1,35) = 19.11, p < 0.001, η_p^2 = 0.35) and weak emotion conditions (F(1,35) = 16.85, p < 0.001, η_p^2 = 0.33), but not in the no emotion condition (F(1,35) = 1.29, p = 0.26, η_p^2 = 0.04). Further analyses revealed that the overall target type × alignment × emotion interaction was driven by differences between the no emotion condition and both the weak (F(1,35) = 7.49, p = 0.01, η_p^2 = 0.18) and strong (F(1,35) = 8.48, p = 0.006, η_p^2 = 0.20) emotion conditions. The interactions in the weak and strong emotion conditions did not differ significantly (F(1,35) = 0.33, p = 0.64, η_p^2 = 0.01).

In the no emotion condition, Bonferroni-corrected post hoc contrasts indicated no effect of alignment for either same (t(35) = 0.92, p = 0.36) or different trials (t(35) = 0.62, p = 0.54). However, accuracy was greater for same trials in both the aligned (t(35) = 3.68, p = 0.001) and misaligned (t(35) = 5.05, p < 0.001) conditions, suggesting an underlying bias to respond ‘same’. In the weak emotion condition, we found evidence for a composite effect: observers' accuracy on the same trials was lower when distractors were aligned than when misaligned (t(35) = 3.34, p = 0.002). This effect was reversed for the different trials (t(35) = 3.14, p = 0.003). In the strong emotion condition, a classic composite effect was found: once again, observers' accuracy on the same trials was lower when distractors were aligned than when misaligned (t(35) = 4.90, p < 0.001), but this was not the case on different trials (t(35) = 1.32, p = 0.20).

    These results highlight the striking influence that emotion cues exert on the strength of the composite illusion measured using image-matching paradigms. Previous reports have described how incongruous distractor emotion impairs emotion judgements made about target regions [5,17,18]. However, the present effects of distractor emotion may be thought of as ‘incidental’ insofar as emotion cues hinder image matching, not emotion labelling or categorization per se. Importantly, these results confirm that illusion-induced interference seen on image-matching composite procedures may result from the binding of face structure or the binding of facial expression.

    Clear and comparable composite illusions were seen when distractor halves depicted strong and intermediate facial emotion. This suggests that the effect is not driven purely by physical dissimilarities in the distractor regions, as the physical differences between strong and intermediate emotion were the same as between intermediate and no emotion conditions. When the distractor halves contained no emotion, however, we found no evidence of a composite effect. We speculate that the lack of a composite effect in this condition may be a product of the procedure employed. Interleaving trials with strong illusory distortion (high emotion) and moderate illusory distortion (intermediate emotion) may have altered participants' decision criteria. While some subtle distortion may be seen in the no emotion condition, it may have been insufficient to elicit ‘different’ responses where participants have the reasonable expectation that ‘same’ and ‘different’ responses should be made with roughly equal frequency within a block.

    The results of Experiment 1 confirm that emotion cues present in distractor regions may induce incidental composite interference, impairing image-matching judgements made about target face regions. Moreover, it appears that relatively weak emotion cues present in the distractor regions are sufficient to induce target distortions. In our second experiment, we sought evidence that perceived emotion in ostensibly ‘neutral’ faces might modulate composite interference in a similar way. If perceived emotion modulates composite binding, distractor halves rich in perceived emotion should exert more illusory distortion on target halves. In Experiment 2, we therefore examined the relative ability of 50 distractor halves—all supposedly ‘neutral’—to distort observers' perception of four target halves, to determine whether this variability is associated with the presence of perceived emotion. Traditional composite-face procedures collapse across multiple targets and distractors to derive a single estimate of observers' susceptibility to the illusion. To estimate the composite interference induced by individual distractors, we therefore employed a novel subjective-report paradigm.

The emotion rating task was completed by 30 naive adults (mean age = 30.8 years; s.d. = 8.0; nine males). A separate group of 46 naive adults (mean age = 46.3 years; s.d. = 9.1; 16 males) participated in the composite distortion task. All participants had normal or corrected-to-normal vision. Ethical clearance was granted by the local ethics committee and the study was conducted in line with the ethical guidelines laid down in the 6th (2008) Declaration of Helsinki. All participants gave informed consent.

The 50 distractor halves and four target halves were cropped from 54 male faces sourced from the Karolinska Directed Emotional Faces [34] and the Radboud Faces Database [30]. Importantly, each actor depicted was posing a neutral expression (i.e. trying not to convey facial emotion). External facial features were occluded using an oval frame. Faces were cropped just above the nostrils. The distractor and target halves were presented in greyscale against a mid-grey background. Once again, a thin grey line (approx. 4 pixels) was inserted in between the target and distractor to help participants distinguish the to-be-judged region. Participants in the rating phase were required to rate each of the 50 distractors for the presence of five emotions (happiness, anger, fear, sadness and disgust)^1 on a 1–100 scale. Each rating trial presented a single distractor in isolation.

Two identical target halves were presented on the left and right side of the display, separated by approximately 7° of visual angle when viewed at 58 cm. On each trial, the left-hand target was aligned with one of the 50 distractors to create a facial composite subtending approximately 6° vertically. The right-hand target was always presented in isolation. Having been told that the targets were physically identical, participants were required to report the strength of the distortion induced by the distractor using a slider (from no distortion to substantial distortion, on a 1–50 scale). No time limit was imposed. Each distractor region was paired with four different eye regions, resulting in 200 subjective-report trials, completed in a randomized order. To help participants familiarize themselves with the nature and strength of the illusion, they viewed all 200 displays for 3 s each before starting the rating procedure. We hoped pre-exposure would improve participants' ability to describe the relative strength of the distortion on a given trial. The experiment was programmed in Matlab with Psychtoolbox extensions [32,33].

The subjective reports of illusory distortion induced by the distractors, provided by each participant, were first averaged across the four target halves (to derive the average distortion reported by a given participant for each distractor), then averaged across participants (to compute the average distortion reported by the sample for each distractor). To produce a single measure of the perceived emotion present in each distractor, we calculated its Euclidean distance in emotion space^2 from the point of absolute neutrality (figure 2a). Smaller scores indicate that distractors were rated closer to neutral and therefore contained less perceived emotion. Despite being cropped from ostensibly emotion-neutral faces, there was considerable variability in the mean distances computed (figure 2b).
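A minimal sketch of the distance computation follows, assuming each distractor's mean ratings on the five scales are stored row-wise and taking the scale minimum (a rating of 1 on every emotion) as the point of absolute neutrality; the exact origin and the numbers below are illustrative assumptions, not values from the study.

```python
import numpy as np

# Mean ratings (1-100) per distractor on the five scales:
# happiness, anger, fear, sadness, disgust (values are made up).
ratings = np.array([[12.0, 35.0,  8.0, 20.0, 15.0],
                    [ 5.0,  6.0,  4.0,  7.0,  5.0]])

neutral = np.ones(5)  # assumed point of absolute neutrality on the 1-100 scale

# Euclidean distance in 5-D emotion space; smaller scores indicate
# distractors rated closer to neutral (less perceived emotion).
perceived_emotion = np.linalg.norm(ratings - neutral, axis=1)
```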

Figure 2. (a) To produce a single estimate of the perceived emotion present in each distractor, we computed its Euclidean distance in emotion space from the point of absolute neutrality. (b) Examples of facial composites used in Experiment 2, constructed with distractors rated high (top) and low (bottom) in perceived emotion. Observers were required to rate the extent to which the lower face half distorted their percept of the upper face half. (c) The correlation between the average magnitude of composite distortion and the average distance of each distractor half from true neutral.

    Simple correlation analysis (figure 2c) revealed a significant positive relationship (r = 0.38, p = 0.006) between the degree of perceived emotion (M = 24.73, s.d. = 8.77) and the composite distortion induced by the distractors (M = 16.86, s.d. = 3.60). Consistent with the view that perceived emotion in supposedly neutral faces induces incidental composite effects, distractors rated as more emotional induced stronger illusory distortion. To our knowledge, this is the first attempt to understand stimulus-specific variation in composite interference.

    The first two experiments employed complementary approaches to study the role of subtle emotion cues on the composite effect; artificially introducing an emotion signal using image morphing (Experiment 1), and using the natural variation present in the population of ‘neutral faces’ available in commonly used face databases (Experiment 2). Nevertheless, the results are convergent; relatively subtle emotion cues, either intended or unintended, can exert a striking influence on the strength of target distortions induced by the composite illusion.

    The results from Experiment 2 suggest a relationship between the emotion ratings awarded to ostensibly neutral distractors and the degree of composite distortion induced. However, describing one's subjective experience of an unfamiliar illusion is challenging [37]. In Experiment 3, we therefore sought to determine whether variation in perceived emotion present in ostensibly neutral faces also modulates performance on a sequential matching composite procedure. In Experiment 1, we employed the original matching design, whereby the distractor regions always differ [8]. However, in Experiment 3, we employed a congruency procedure that also included trials where distractors were the same. Some authors have speculated that this design measures composite-face effects in a way that attenuates the influence of response bias [38] (for a different view see [8]). For the sake of clarity, we provide supplementary analyses of those trials where the distractors differ (the original design). We note, however, that these results suggest a similar conclusion to those obtained with the full congruency design. An inverted control condition was also employed to confirm that the effects of alignment are orientation sensitive [10].

Twenty naive adults (mean age = 27.2 years; s.d. = 4.7; five males) with normal or corrected-to-normal vision participated in Experiment 3. Ethical clearance was granted by the local ethics committee and the study was conducted in line with the ethical guidelines laid down in the 6th (2008) Declaration of Helsinki. All participants gave informed consent.

    Eighteen distractor halves were selected from the 50 used in Experiment 2. The nine judged closest to true neutral were selected for use in the low perceived emotion condition, and the nine judged furthest from true neutral were selected for use in the high perceived emotion condition. Once more, we note that all 18 were cropped from supposedly ‘neutral’ faces. Eighteen target halves were sourced from the Karolinska [34] and Radboud databases [30], including the four used in Experiment 2. Facial composites subtended approximately 6° vertically when viewed at 58 cm. A thin grey line (approx. 4 pixels) was inserted in between the target and distractor to guide participants' judgements.

On each trial, observers were asked to indicate whether the target halves of two sequentially presented composites were the same or different, in the presence of distractor halves that were either identical or different, and either high or low in perceived emotion. Two control manipulations were employed: an inverted condition, where both composites were presented upside-down, and a misaligned condition, where target and distractor halves were offset horizontally by approximately 3° (figure 3a). In total, there were 576 experimental trials: 18 target combinations × 2 target types (same, different) × 2 distractor types (same, different) × 2 levels of perceived emotion (high, low) × 2 orientations (upright, inverted) × 2 alignments (aligned, misaligned). All trial types were randomly interleaved. The experiment lasted 35 min and was separated into 10 blocks. The experiment was programmed in Matlab with Psychtoolbox extensions [32,33].

    Figure 3. (a) Illustration of each condition; upright aligned, upright misaligned, inverted aligned and inverted misaligned. Each trial began with a central fixation cross. The first target face was then presented for 500 ms, followed by a mask for 500 ms. The second face was visible until a response was registered. Observers responded ‘same’ or ‘different’ using the keyboard. Observers were instructed to make their judgement on the upper face half (i.e. the eye region) irrespective of composite orientation. (b) Trial types for the complete composite design used in Experiment 3.

    Presenting every possible target-distractor pairing in each of the different conditions would have necessitated a prohibitive number of experimental trials. While everyone judged the same 18 target combinations on the same trials (i.e. all 18 target halves), each participant judged a different set of 18 target combinations on the different trials. Distractor halves were assigned pseudo-randomly. Where different distractors were employed on a trial, they were chosen from the same emotion condition (half high perceived emotion; half low perceived emotion). For 50% of the same trials, the targets were paired with the same distractor (congruent-same trials); for the remaining same trials, targets were paired with different distractors (incongruent-same trials). For 50% of the different trials, targets were paired with the same distractor (incongruent-different trials); for the remaining different trials, targets were paired with different distractors (congruent-different trials).

    Target halves (Tsame, Tdifferent) and distractor halves (Dsame, Ddifferent) were combined in a complete factorial design, yielding four possible trial types (figure 3b): congruent-same (Tsame, Dsame), incongruent-same (Tsame, Ddifferent), congruent different (Tdifferent, Ddifferent), and incongruent-different (Tdifferent, Dsame). On congruent trials, composite effects are thought to aid observers' performance (identical distractors facilitate ‘same’ decisions about identical targets; different distractors facilitate ‘different’ decisions about non-identical targets). On incongruent trials, composite effects are thought to impair observers' performance (identical distractors hinder ‘different’ decisions about non-identical targets; different distractors hinder ‘same’ decisions about identical targets). In this congruency design, composite effects are therefore indexed by a disproportionate effect of congruency (congruent, incongruent) when composites are upright and aligned, relative to inverted or misaligned conditions (e.g. [21,39]). For each cell in the design (emotion × orientation × alignment), we therefore estimated observers' discrimination sensitivity on congruent and incongruent trials through the calculation of d' statistics [40].
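For readers unfamiliar with signal detection measures, a minimal sketch of the d' computation for one design cell follows; the log-linear correction and the counts are illustrative assumptions, not taken from the authors' analysis (the standard formula is d' = z(hit rate) − z(false-alarm rate)).

```python
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Discrimination sensitivity d' = z(hit rate) - z(false-alarm rate).
    A log-linear correction keeps z finite when a rate is 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# One d' per congruency condition within each emotion x orientation x
# alignment cell of the design; the counts below are illustrative only.
print(d_prime(hits=20, misses=4, false_alarms=6, correct_rejections=18))
```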

Significant composite effects, indicated by the characteristic congruency × orientation × alignment interactions, were seen in both the high (F(1,19) = 22.23, p < 0.001, η_p^2 = 0.54) and low (F(1,19) = 4.53, p = 0.047, η_p^2 = 0.19) perceived emotion conditions (table 1 and figure 4). Critically, however, composite effects were larger when distractors contained high levels of perceived emotion (F(1,19) = 4.56, p = 0.046, η_p^2 = 0.41). This interaction with emotion was driven by sensitivity differences in the upright conditions, indicated by a significant emotion × congruency × alignment interaction (F(1,19) = 6.25, p = 0.02, η_p^2 = 0.25). When composites were upright and aligned, there was also an emotion × congruency interaction (F(1,19) = 9.30, p = 0.007, η_p^2 = 0.33). Importantly, none of the interactions with emotion reached significance when composites were inverted or misaligned (all Fs < 2.6; all ps > 0.12). When the composites were presented upright and aligned, effects of congruency were observed in both the high (t(19) = 6.0, p < 0.001) and low (t(19) = 2.98, p = 0.008) perceived emotion conditions. Observers' sensitivity differed significantly for the high and low emotion distractors on the congruent trials (t(19) = 3.08, p = 0.006), but not on the incongruent trials (t(19) = 1.29, p = 0.21).

    Figure 4. Results from Experiment 3 in the (a) high, and (b) low perceived emotion conditions. *** denotes p < 0.001, ** denotes p < 0.01, * denotes p < 0.05, n.s., non-significant. Error bars denote ± 1 s.e.m.

Table 1. Results of the ANOVAs performed on the high and low perceived emotion conditions in Experiment 3. (Values in bold in the original indicate significant effects.)

                                          high perceived emotion          low perceived emotion
effect                                    F        p        η_p^2         F        p        η_p^2
congruency                                2.51     0.130    0.12          4.29     0.052    0.18
orientation                               14.28    0.001    0.43          10.69    0.004    0.36
alignment                                 0.09     0.767    0.01          3.93     0.062    0.17
congruency × orientation                  12.86    0.002    0.40          1.23     0.282    0.06
congruency × alignment                    5.80     0.026    0.23          0.15     0.710    0.01
orientation × alignment                   0.078    0.784    0.01          0.09     0.770    0.01
congruency × orientation × alignment      22.23    <0.001   0.54          4.53     0.047    0.19
upright
  congruency                              9.45     0.006    0.33          5.15     0.035    0.21
  alignment                               0.150    0.703    0.01          2.51     0.130    0.12
  congruency × alignment                  16.91    0.001    0.47          2.12     0.162    0.10
inverted
  congruency                              2.72     0.120    0.13          0.63     0.438    0.03
  alignment                               0.004    0.950    0.00          1.53     0.230    0.07
  congruency × alignment                  2.07     0.170    0.10          3.44     0.079    0.15

The findings from these complementary experiments indicate that subtle facial emotion cues exert a striking influence on the strength of the composite-face effect. In our first experiment, we found that composite interference grew stronger as the strength of the emotion signal present in the distractor increased. Critically, effects of distractor emotion were induced by relatively weak cues (only 50% of the full emotion intensity), and were incidental insofar as emotion cues hindered image matching, not emotion labelling or categorization per se. Next, we examined whether perceived emotion cues present in ostensibly neutral faces are strong enough to modulate composite interference in a similar way. We found a correlation between the strength of perceived emotion cues (rated by one set of participants) and the strength of illusory distortion induced (assessed by different participants) in a set of 50 ‘neutral’ distractors taken from commonly used face databases. In Experiment 3, we compared the composite effects induced by ostensibly neutral distractors rated high and low for perceived emotion, measured using a sequential matching task. We found that significantly larger composite effects were induced by the emotion-rich distractors; strikingly, the characteristic interaction effect was more than twice as strong in the high perceived emotion condition.

Different learning traditions have converged on the principle that the degree of covariation between stimulus elements determines whether they will be grouped together [41,42]. Crucially, facial emotions are known to comprise highly correlated feature changes [43]. Exposure to this covariation may therefore provide a strong basis for inter-feature perceptual prediction [44,45], and underlie the compelling composite distortion induced by facial emotion [5,17,18]. The identity [7], age [14] and gender [15] composite effects may have a similar origin; for example, in our day-to-day environment, the presence of a male mouth reliably predicts the presence of male eyes. We speculate that the strength of illusory distortion induced by different composite arrangements may be determined by the strength of these cross-feature contingencies.^3 Subtle expression cues may exert a strong influence on the composite-face effect because of the striking statistical regularities seen in facial expressions [43]. We note that composite effects have recently been reported with expressive body postures [46], but not for neutral body shapes [47]. The highly coordinated nature of whole body actions may also underlie the composite effects seen in this domain.

The present results suggest that composite effects measured in sequential image-matching paradigms probably reflect illusory interference induced by both expression and structure cues. We are not in a position to determine whether these sources of distortion interact or combine additively. Insofar as facial structure and facial expression are largely independent sources of facial variation [48,49], perceptual predictions derived from structure and expression cues may also be relatively independent [5]. Nevertheless, illusory distortion induced by expression cues may hinder the matching of targets based on facial identity. Observers experience well-documented difficulties encoding the facial structure of unfamiliar faces [50–52]. For example, when asked to sort photographs of two unfamiliar individuals according to the identity of those depicted, observers perform poorly, frequently attributing the photographs to eight or more different individuals [53]. Deriving an expression-invariant description of unfamiliar faces poses a particular challenge; when viewing a single image, it is often impossible to determine whether a stranger is scowling or simply has narrow eyes. In light of the difficulties partitioning facial variance according to structure and expression, expression distortions may affect identity matching for unfamiliar faces.

Previous research has revealed that perceived emotion can exert a strong influence on the judgements we make about the character traits of others. For example, the detection of anger and happiness may be responsible for trait judgements of dominance and trustworthiness inferred spontaneously from supposedly neutral faces [26,54]. Consistent with this view, observers who have difficulties interpreting facial emotion make unusual trait judgements about neutral faces [55]. The current findings further illustrate the unexpected effects that unintended emotion cues may exert on the perception of ‘emotionless’ faces. Such cues may not only influence the judgement of character traits but may also modulate the extent to which faces are processed holistically. Interestingly, the present findings suggest that highly trustworthy and highly dominant faces may tend to produce large composite effects, insofar as both may be rich in perceived emotion cues (see also [56]).

    We have argued that subtle emotion cues present in distractor regions exert a striking influence on the strength of composite-face effects, possibly because of the strength of the inter-feature contingencies present in manifest facial expressions. However, some readers might query whether emotion cues modulate composite effects via another route. If the presence of emotion cues made the distractor regions more salient, they may have impaired matching through generic distraction, rather than distortion induced by the composite-face illusion. Two of our findings speak against this alternative account. First, generic distraction effects should be relatively insensitive to the alignment manipulation. Crucially, however, we only saw effects of emotion when distractor regions were aligned; the presence of emotion had little effect when distractors were misaligned. Second, effects of emotion were seen when participants were asked to rate the strength of the illusory distortion without any time pressure (Experiment 2). Distraction effects might conceivably impair sequential matching ability where arrangements are presented very briefly. In Experiment 2, however, participants could take as long as they wished to compare the target aligned with the distractor, and the target presented in isolation.

A further possibility that warrants discussion is the suggestion that the increased strength of the composite illusion was not attributable to facial emotion per se. Instead, some distractor regions with unusual or distinctive facial structure were perhaps more likely to be perceived as emotional; for example, ambiguous face shapes may be more susceptible to a high-emotion perceptual interpretation. Thus, apparent modulation by facial emotion may have been driven by underlying variation in facial structure. Again, however, features of our data speak against this view. First, in Experiment 1 we found that increasing the strength of the emotion signal present on the same facial identities can increase the strength of the composite illusion. In this situation, there is little possibility that perceived emotion is confounded with facial structure. This finding confirms that effects of emotion can be seen independently of facial structure. Second, it is evident from Experiment 2 that perceived emotion cues are present in a great many ‘neutral’ distractor regions sourced from popular face databases. It seems unlikely that all of these faces are unusual or distinctive. Rather, it appears that posing expressions that are truly emotion neutral may be a formidable challenge for actors of all face shapes.

The present results have important implications for researchers using the composite-face paradigm to investigate holistic processing in typical and atypical populations. Previous studies comparing individuals' susceptibility to the composite illusion and other markers of holistic processing, notably the part-whole effect [57], have yielded inconsistent findings [58–60]. The relationship between observers' susceptibility to the composite-face effect and their face recognition ability also remains uncertain [22,58–62]. These mixed results have cast doubt on the functional significance of holistic face processing as measured by the composite paradigm [9]. Crucially, however, many widely used stimulus sets contain composites rich in facial emotion (figure 5). While these sets may yield strong, replicable composite effects, individual differences may be less likely to correlate with susceptibility to the part-whole effect and measures of face recognition ability. Instead, the present results raise the possibility that individual differences in illusion susceptibility may sometimes correlate with measures of expression recognition.

    Figure 5. Examples of facial composites taken from a popular stimulus set developed by Le Grand and co-workers [20]. This set has been widely used to investigate holistic processing in typical and atypical populations [17,22,63–66]. While composites are constructed with ostensibly neutral faces, subtle emotion cues are present in many of the arrangements. These unintended emotion cues, together with the absence of a gap between the target and distractor regions [31], may contribute to the large effect sizes seen with this set.

    The present results also have implications for the study of holistic face processing in atypical populations. For example, some authors have found that observers with autism spectrum disorder (ASD) exhibit broadly typical composite-face effects [63], whereas other findings indicate abnormal processing of facial composites [67]. Importantly, however, there is considerable heterogeneity within the ASD population in terms of expression recognition [68,69]. This variability may help explain the inconsistent performance of ASD samples on composite-face tasks. Similarly, cases of developmental prosopagnosia (DP) have been described who exhibit typical composite effects despite severe face recognition difficulties [64,70]. However: (i) it is known that many DPs exhibit good expression recognition [71,72], and (ii) the composite stimuli used in these studies include salient emotion cues. It is unclear, therefore, whether these individuals exhibit intact holistic face processing per se, or intact holistic processing of facial emotion.

    Finally, several authors have sought to investigate the origin and specificity of the composite effect by comparing the strength of illusory interference induced by faces and other types of object [73]. However, the strength of the face composite effect will probably depend on the degree of perceived emotion present in the arrangement. When contrasting the size of composite effects induced by the binding of face shape with those seen for rigid non-face objects-of-expertise, such as ‘Greebles’ [74], authors should seek to exclude perceived emotion cues from their face arrangements; i.e. to ensure binding is based solely on the covariation of structure cues in the stimulus classes compared. We speculate that animating to-be-learned items with coordinated patterns of global change—mirroring the correlated dynamics of whole body actions and facial expressions—may increase the strength of composite interference seen with non-face objects-of-expertise.

    The results from the three experiments described indicate that perceived emotion cues modulate the strength of the composite-face effect when stimulus arrangements are constructed from supposedly ‘neutral’ faces. These results have important implications for research addressing holistic processing in typical and atypical populations. Understanding the contribution of perceived emotion to inter-stimulus variability may help reveal the relationship between composite interference, other markers of holistic face processing, and face recognition ability.

Each study was granted ethical clearance by the local ethics committee (University of Reading for Experiments 1 and 3; City, University of London for Experiment 2) and was conducted in line with the ethical guidelines laid down in the 6th (2008) Declaration of Helsinki. All participants gave informed consent.

The datasets supporting this article are available as electronic supplementary material.

    K.L.H.G. and R.C. conceived and designed the studies. J.M., J.E.M. and K.L.H.G. collected the data. K.L.H.G., J.M., J.E.M. and R.C. conducted the statistical analyses, and K.L.H.G. and R.C. drafted the manuscript. All authors gave final approval for publication.

    We have no competing interests.

    K.L.H.G. was supported by an Experimental Psychology Society award. J.M. is supported by a doctoral studentship funded by the Economic and Social Research Council (ESRC).

    We thank Esther Franke and Raul Ungureanu for assistance with data collection.

    Footnotes

    1 Surprise was not included as an option in the rating task, as the status of surprise as a basic emotion has been questioned (e.g. [35]). There is also some evidence that the perceptual representation of surprise behaves differently to that of other facial emotions; e.g. surprise expressions may not be perceived categorically [36].

    2 The use of Euclidean distances assumes orthogonal dimensionality. When judging whole-face expressions, ratings of anger and disgust are known to correlate. Some sign of this association was also seen in the half-face ratings collected in Experiment 2; however, the correlation was not strong (r = 0.32).

    3 Upper face halves are commonly used as distractor regions in composite paradigms insofar as mouth-to-eye interference is typically far stronger than eye-to-mouth interference. Interestingly, this asymmetry potentially accords with a feature covariation account. Predictions made about the eye region based on the state of the mouth region may be more reliable than the predictions made about the mouth region based on the state of the eye region. By way of comparison, when English readers encounter the letter ‘Q’ in a sentence, there is a strong likelihood that the next letter will be ‘U’. Conversely, when the letter ‘U’ is encountered, it is less likely that the preceding letter will be ‘Q’.

    Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.3738158.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Farah MJ, Wilson KD, Drain M, Tanaka JN. 1998 What is ‘special’ about face perception? Psychol. Rev. 105, 482–498. (doi:10.1037/0033-295X.105.3.482)
    2. Maurer D, Le Grand R, Mondloch CJ. 2002 The many faces of configural processing. Trends Cogn. Sci. 6, 255–260. (doi:10.1016/S1364-6613(02)01903-4)
    3. Piepers DW, Robbins RA. 2013 A review and clarification of the terms ‘holistic’, ‘configural’, and ‘relational’ in the face perception literature. Front. Psychol. 3, 1–11.
    4. McKone E, Yovel G. 2009 Why does picture-plane inversion sometimes dissociate the perception of features and spacing in faces, and sometimes not? Toward a new theory of holistic processing. Psychon. Bull. Rev. 16, 778–797. (doi:10.3758/PBR.16.5.778)
    5. Calder AJ, Young AW, Keane J, Dean M. 2000 Configural information in facial expression perception. J. Exp. Psychol. Hum. Percept. Perform. 26, 527–551. (doi:10.1037/0096-1523.26.2.527)
    6. Hole G. 1994 Configurational factors in the perception of unfamiliar faces. Perception 23, 65–74. (doi:10.1068/p230065)
    7. Young AW, Hellawell D, Hay DC. 1987 Configurational information in face perception. Perception 16, 747–759. (doi:10.1068/p160747)
    8. Rossion B. 2013 The composite face illusion: a whole window into our understanding of holistic face perception. Vis. Cogn. 21, 139–253. (doi:10.1080/13506285.2013.772929)
    9. Murphy J, Gray KLH, Cook R. 2016 The composite face illusion. Psychon. Bull. Rev. (doi:10.3758/s13423-016-1131-5)
    10. McKone E et al. 2013 Importance of the inverted control in measuring holistic face processing with the composite effect and part-whole effect. Front. Psychol. 4, 1–21. (doi:10.3389/fpsyg.2013.00033)
    11. Susilo T, Rezlescu C, Duchaine B. 2013 The composite effect for inverted faces is reliable at large sample sizes and requires the basic face configuration. J. Vis. 13, 1–9.
    12. Bruce V, Young AW. 1986 Understanding face recognition. Br. J. Psychol. 77, 305–327. (doi:10.1111/j.2044-8295.1986.tb02199.x)
    13. Haxby JV, Hoffman EA, Gobbini MI. 2000 The distributed human neural system for face perception. Trends Cogn. Sci. 4, 223–233. (doi:10.1016/S1364-6613(00)01482-0)
    14. Hole G, George P. 2011 Evidence for holistic processing of facial age. Vis. Cogn. 19, 585–615. (doi:10.1080/13506285.2011.562076)
    15. Baudouin JY, Humphreys GW. 2006 Configural information in gender categorisation. Perception 35, 531–540. (doi:10.1068/p3403)
    16. Abbas ZA, Duchaine B. 2008 The role of holistic processing in judgments of facial attractiveness. Perception 37, 1187–1196. (doi:10.1068/p5984)
    17. Palermo R, Willis ML, Rivolta D, McKone E, Wilson CE, Calder AJ. 2011 Impaired holistic coding of facial expression and facial identity in congenital prosopagnosia. Neuropsychologia 49, 1226–1235. (doi:10.1016/j.neuropsychologia.2011.02.021)
    18. Tanaka JW, Kaiser MD, Butler S, Le Grand R. 2012 Mixed emotions: holistic and analytic perception of facial expressions. Cogn. Emot. 26, 961–977. (doi:10.1080/02699931.2011.630933)
    19. Goffaux V, Rossion B. 2006 Faces are ‘spatial’–holistic face perception is supported by low spatial frequencies. J. Exp. Psychol. Hum. Percept. Perform. 32, 1023–1039. (doi:10.1037/0096-1523.32.4.1023)
    20. Le Grand R, Mondloch CJ, Maurer D, Brent HP. 2004 Impairment in holistic face processing following early visual deprivation. Psychol. Sci. 15, 762–768. (doi:10.1111/j.0956-7976.2004.00753.x)
    21. Richler JJ, Cheung OS, Gauthier I. 2011 Holistic processing predicts face recognition. Psychol. Sci. 22, 464–471. (doi:10.1177/0956797611401753)
    22. Konar Y, Bennett PJ, Sekuler AB. 2010 Holistic processing is not correlated with face-identification accuracy. Psychol. Sci. 21, 38–43. (doi:10.1177/0956797609356508)
    23. Lee E, Kang JI, Park IH, Kim JJ, An SK. 2008 Is a neutral face really evaluated as being emotionally neutral? Psychiatry Res. 157, 77–85. (doi:10.1016/j.psychres.2007.02.005)
    24. Oosterhof NN, Todorov A. 2009 Shared perceptual basis of emotional expressions and trustworthiness impressions from faces. Emotion 9, 128–133. (doi:10.1037/a0014520)
    25. Said CP, Sebe N, Todorov A. 2009 Structural resemblance to emotional expressions predicts evaluation of emotionally neutral faces. Emotion 9, 260–264. (doi:10.1037/a0014681)
    26. Todorov A, Said CP, Engell AD, Oosterhof NN. 2008 Understanding evaluation of faces on social dimensions. Trends Cogn. Sci. 12, 455–460. (doi:10.1016/j.tics.2008.10.001)
    27. Neth D, Martinez AM. 2009 Emotion perception in emotionless face images suggests a norm-based representation. J. Vis. 9, 1–11.
    28. Ross DA, Richler JJ, Gauthier I. 2015 Reliability of composite-task measurements of holistic face processing. Behav. Res. Methods 47, 736–743. (doi:10.3758/s13428-014-0497-4)
    29. Richler JJ, Gauthier I. 2014 A meta-analysis and review of holistic face processing. Psychol. Bull. 140, 1281–1302. (doi:10.1037/a0037004)
    30. Langner O, Dotsch R, Bijlstra G, Wigboldus DHJ, Hawk ST, van Knippenberg A. 2010 Presentation and validation of the Radboud Faces Database. Cogn. Emot. 24, 1377–1388. (doi:10.1080/02699930903485076)
    31. Rossion B, Retter TL. 2015 Holistic face perception: mind the gap! Vis. Cogn. 23, 379–398. (doi:10.1080/13506285.2014.1001472)
    32. Brainard DH. 1997 The psychophysics toolbox. Spat. Vis. 10, 433–436. (doi:10.1163/156856897X00357)
    33. Pelli DG. 1997 The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat. Vis. 10, 437–442. (doi:10.1163/156856897X00366)
    34. Lundqvist D, Flykt A, Öhman A. 1998 The Karolinska Directed Emotional Faces—KDEF. CD ROM from Department of Clinical Neuroscience, Psychology section, Karolinska Institutet, Solna, Sweden.
    35. Oatley K, Johnson-Laird PN. 1986 Towards a cognitive theory of emotions. Cogn. Emot. 1, 29–50. (doi:10.1080/02699938708408362)
    36. Etcoff NL, Magee JJ. 1992 Categorical perception of facial expressions. Cognition 44, 227–240. (doi:10.1016/0010-0277(92)90002-Y)
    37. Schooler J, Schreiber CA. 2004 Experience, meta-consciousness, and the paradox of introspection. J. Conscious. Stud. 11, 17–39.
    38. Richler JJ, Cheung OS, Gauthier I. 2011 Beliefs alter holistic face processing … if response bias is not taken into account. J. Vis. 11, 1–13. (doi:10.1167/11.13.17)
    39. Richler JJ, Gauthier I, Wenger MJ, Palmeri TJ. 2008 Holistic processing of faces: perceptual and decisional components. J. Exp. Psychol. Learn. Mem. Cogn. 34, 328–342. (doi:10.1037/0278-7393.34.2.328)
    40. Macmillan NA, Creelman CD. 1991 Detection theory: a user's guide. New York, NY: Cambridge University Press.
    41. Aslin RN, Newport EL. 2012 Statistical learning from acquiring specific items to forming general rules. Curr. Dir. Psychol. Sci. 21, 170–176. (doi:10.1177/0963721412436806)
    42. Pearce JM, Bouton ME. 2001 Theories of associative learning in animals. Annu. Rev. Psychol. 52, 111–139. (doi:10.1146/annurev.psych.52.1.111)
    43. Jack RE, Garrod OGB, Schyns PG. 2014 Dynamic facial expressions of emotion transmit an evolving hierarchy of signals over time. Curr. Biol. 24, 187–192. (doi:10.1016/j.cub.2013.11.064)
    44. Cook R, Aichelburg C, Johnston A. 2015 Illusory feature slowing: evidence for perceptual models of global facial change. Psychol. Sci. 26, 512–517. (doi:10.1177/0956797614567340)
    45. Johnston A. 2011 Is dynamic face perception primary? In Dynamic faces: insights from experiments and computation (eds Curio C, Giese M, Bulthoff HH). Cambridge, MA: MIT Press.
    46. Willems S, Vrancken L, Germeys F, Verfaillie K. 2014 Holistic processing of human body postures: evidence from the composite effect. Front. Psychol. 5, 618. (doi:10.3389/fpsyg.2014.00618)
    47. Soria-Bauser DA, Suchan B, Daum I. 2011 Differences between perception of human faces and body shapes: evidence from the composite illusion. Vis. Res. 51, 195–202. (doi:10.1016/j.visres.2010.11.007)
    48. Calder AJ, Burton AM, Miller P, Young AW, Akamatsu S. 2001 A principal component analysis of facial expressions. Vis. Res. 41, 1179–1208. (doi:10.1016/S0042-6989(01)00002-5)
    49. Calder AJ, Young AW. 2005 Understanding the recognition of facial identity and facial expression. Nat. Rev. Neurosci. 6, 641–651. (doi:10.1038/nrn1724)
    50. Hancock PJB, Bruce V, Burton AM. 2000 Recognition of unfamiliar faces. Trends Cogn. Sci. 4, 330–337. (doi:10.1016/S1364-6613(00)01519-9)
    51. Jenkins R, Burton AM. 2011 Stable face representations. Phil. Trans. R. Soc. B 366, 1671–1683. (doi:10.1098/rstb.2010.0379)
    52. Murphy J, Ipser A, Gaigg SB, Cook R. 2015 Exemplar variance supports robust learning of facial identity. J. Exp. Psychol. Hum. Percept. Perform. 41, 577–581. (doi:10.1037/xhp0000049)
    53. Jenkins R, White D, Van Monfort X, Burton AM. 2011 Variability in photos of the same face. Cognition 121, 313–323. (doi:10.1016/j.cognition.2011.08.001)
    54. Oosterhof NN, Todorov A. 2008 The functional basis of face evaluation. Proc. Natl Acad. Sci. USA 105, 11087–11092. (doi:10.1073/pnas.0805664105)
    55. Brewer R, Collins F, Cook R, Bird G. 2015 Atypical trait inferences from facial cues in alexithymia. Emotion 15, 637–643. (doi:10.1037/emo0000066)
    56. Todorov A, Loehr V, Oosterhof NN. 2010 The obligatory nature of holistic processing of faces in social judgments. Perception 39, 514–532. (doi:10.1068/p6501)
    57. Tanaka JW, Farah MJ. 1993 Parts and wholes in face recognition. Q. J. Exp. Psychol. 46, 225–245. (doi:10.1080/14640749308401045)
    58. DeGutis J, Wilmer J, Mercado RJ, Cohan S. 2013 Using regression to measure holistic face processing reveals a strong link with face recognition ability. Cognition 126, 87–100. (doi:10.1016/j.cognition.2012.09.004)
    59. Wang R, Li J, Fang H, Tian M, Liu J. 2012 Individual differences in holistic processing predict face recognition ability. Psychol. Sci. 23, 169–177. (doi:10.1177/0956797611420575)
    60. Rezlescu C, Susilo T, Wilmer JB, Caramazza A. In press. The inversion, part-whole, and composite effects reflect distinct perceptual mechanisms with varied relationship to face recognition. J. Exp. Psychol. Hum. Percept. Perform. (doi:10.1037/xhp0000400)
    61. Avidan G, Tanzer M, Behrmann M. 2011 Impaired holistic processing in congenital prosopagnosia. Neuropsychologia 49, 2541–2552. (doi:10.1016/j.neuropsychologia.2011.05.002)
    62. Finzi RD, Susilo T, Barton JJS, Duchaine B. 2017 The role of holistic face processing in acquired prosopagnosia: evidence from the composite face effect. Vis. Cogn. 24, 304–320.
    63. Nishimura M, Rutherford MD, Maurer D. 2008 Converging evidence of configural processing of faces in high-functioning adults with autism spectrum disorders. Vis. Cogn. 16, 859–891. (doi:10.1080/13506280701538514)
    64. Le Grand R, Cooper PA, Mondloch CJ, Lewis TL, Sagiv N, de Gelder B, Maurer D. 2006 What aspects of face processing are impaired in developmental prosopagnosia? Brain Cogn. 61, 139–158. (doi:10.1016/j.bandc.2005.11.005)
    65. Mondloch CJ, Pathman T, Maurer D, Le Grand R, de Schonen S. 2007 The composite face effect in six-year-old children: evidence of adult-like holistic face processing. Vis. Cogn. 15, 564–577. (doi:10.1080/13506280600859383)
    66. Schmalzl L, Palermo R, Coltheart M. 2008 Cognitive heterogeneity in genetically based prosopagnosia: a family study. J. Neuropsychol. 2, 99–117. (doi:10.1348/174866407X256554)
    67. Gauthier I, Klaiman C, Schultz RT. 2009 Face composite effects reveal abnormal face processing in autism spectrum disorders. Vis. Res. 49, 470–478. (doi:10.1016/j.visres.2008.12.007)
    68. Bird G, Cook R. 2013 Mixed emotions: the contribution of alexithymia to the emotional symptoms of autism. Transl. Psychiatry 3, e285. (doi:10.1038/tp.2013.61)
    69. Cook R, Brewer R, Shah P, Bird G. 2013 Alexithymia, not autism, predicts poor recognition of emotional facial expressions. Psychol. Sci. 24, 723–732. (doi:10.1177/0956797612463582)
    70. Susilo T et al. 2010 Face recognition impairments despite normal holistic processing and face space coding: evidence from a case of developmental prosopagnosia. Cogn. Neuropsychol. 27, 636–664. (doi:10.1080/02643294.2011.613372)
    71. Duchaine BC, Parker H, Nakayama K. 2003 Normal recognition of emotion in a prosopagnosic. Perception 32, 827–838. (doi:10.1068/p5067)
    72. Biotti F, Cook R. 2016 Impaired perception of facial emotion in developmental prosopagnosia. Cortex 81, 126–136. (doi:10.1016/j.cortex.2016.04.008)
    73. Richler JJ, Wong YK, Gauthier I. 2011 Perceptual expertise as a shift from strategic interference to automatic holistic processing. Curr. Dir. Psychol. Sci. 20, 129–134. (doi:10.1177/0963721411402472)
    74. Gauthier I, Williams P, Tarr MJ, Tanaka JW. 1998 Training ‘greeble’ experts: a framework for studying expert object recognition processes. Vis. Res. 38, 2401–2428. (doi:10.1016/S0042-6989(97)00442-2)


    Page 13

    Adolescence describes the period of transition between childhood and adulthood when individuals undergo considerable psychological and physical change [1]. In particular, social cognition and behaviour change dramatically in this period, underpinned by the rapid development of the ‘social brain’, the network of brain areas involved in social information processing (e.g. [2]). Because of this, social relationships become increasingly salient in adolescence, particularly with regard to gaining approval and avoiding rejection from peers [3,4]. Additionally, during adolescence there is rapid development of the brain's dopaminergic system, which processes rewarding stimuli [5,6]. Together, the neural changes in social and reward processing networks mean that adolescents may find social interactions particularly motivating and influential, which can lead to more risky behaviour in the presence of peers (e.g. [7]). Understanding social reward processing in adolescents is therefore critical for understanding social behaviour and well-being in this age group.

    Experimental evidence has documented the reward value of social stimuli in adolescence. For example, studies have found that adolescents' performance on cognitive tasks is improved by positive social feedback such as smiling faces [8] or ‘thumbs up’ gestures [9], and such stimuli are subjectively rated as likeable [9]. Other research has indicated that socially rewarding stimuli may actually be more salient for adolescents than they are for adults. For example, distracting smiling faces impaired performance in a working memory task for adolescents (aged 12–14) but not adults, indicating that smiling faces may be especially salient for adolescents [10].

    Despite the theoretical and empirical data indicating that social stimuli and interactions are an important source of reward for adolescents, to our knowledge no research has attempted to empirically identify and categorize the different types of social interactions that adolescents find rewarding. Some researchers note only that social relationships in general become more rewarding in adolescence (e.g. [5]). Others evaluate a specific type of social reward, such as the presence of peers [11] or smiling faces [8,10]. However, as yet, there has been no comprehensive assessment of the full range of social experiences that are rewarding for adolescents. There is also no existing measure, to our knowledge, that assesses individual differences in the reward value of social experiences for use in adolescents.

    In adults, the Social Reward Questionnaire (SRQ [12]) is a valid and reliable measure of individual differences in the value of different social rewards. This previous study used exploratory and confirmatory factor analyses (EFA and CFA) to identify six types of social reward: enjoyment of Admiration, Negative Social Potency, Passivity, Prosocial Interactions, Sexual Relationships and Sociability [12]. Each type of social reward equates to a subscale in the SRQ. These six subscales were not selected a priori, but were the result of the model that best fit the initial data set of 75 items. After the initial EFA supported this six-factor model, the best items in each factor were selected (those that loaded most strongly and unambiguously onto each factor) and the other items were discarded. This refined item set (23 items, 3–5 items for each of the 6 factors) was then subjected to a CFA, which confirmed that a six-factor model fit the data well. The names of the subscales were chosen to reflect the content of the items within each factor (see [12] for more detail on the development of the adult SRQ).

    The subscales in the adult SRQ cover the following domains of social reward: Sexual Relationships, the enjoyment of sexual intimacy; Admiration, the enjoyment of being flattered and gaining positive attention; Negative Social Potency, the enjoyment of being cruel, antagonistic and using others; Passivity, the enjoyment of giving others control and allowing them to make decisions; Prosocial Interactions, the enjoyment of having kind and reciprocal relationships; and Sociability, the enjoyment of engaging in group interactions. Each subscale has good psychometric properties and also showed a unique pattern of associations with external measures, providing support for the meaning of each subscale [12]. The primary aim of the present study is to modify the adult SRQ so that it can be used to assess individual differences in social reward value in adolescents.

    The adult SRQ can be used as a tool to explore the experience of social reward in individuals who display problematic or unusual social behaviour [13,14]. For example, psychopathic traits—problematic personality traits including a lack of empathy and antisocial behaviour [15]—have been associated with an atypical pattern of social reward [13]. Specifically, adults with high levels of psychopathic traits display an ‘inverted’ pattern of social reward, in which they report that being cruel towards others is enjoyable and being kind is not. This is in stark contrast to the majority of typical individuals, who find cruelty aversive and experience affiliative interactions and relationships as a fundamental source of reward (e.g. [16]). These findings are potentially important when trying to understand the mechanisms behind the atypical social behaviour seen in psychopathy, i.e. the high levels of antisocial behaviour and low levels of affiliative, prosocial behaviour.

    Psychopathic-type traits such as a lack of empathy and guilt can be detected in children and adolescents, and are termed callous–unemotional (CU) traits in this age group [17]. Young people with high levels of CU traits are at an elevated risk of having high levels of psychopathic traits when they become adults, and so CU traits are considered to be antecedents to adult psychopathy [18]. Like adults with psychopathic traits, adolescents with high levels of CU traits display problematic social behaviour. For example, compared with adolescents with low levels of CU traits, those with high levels of CU traits tend to endorse more antisocial solutions to achieve their goals, such as using aggression [19], and are more likely to bully others [20]. Unsurprisingly, their friendships tend to be shorter than those of typical adolescents [21]. It is important to understand possible mechanisms behind the problematic social behaviour seen in these adolescents, and one such mechanism is atypical social reward.

    There is some limited existing evidence that adolescents with high levels of CU traits may have atypical processing of typically rewarding social stimuli such as happy faces [22,23]. In one study, adolescents with high levels of CU traits were less distracted by irrelevant happy faces compared with typically developing controls [23]. Other research has demonstrated that children with high levels of CU traits spend less time looking at their mothers’ faces, irrespective of the mothers’ behaviour [22]. Together, this research presents an interesting possibility that typically socially rewarding stimuli (such as happy faces) may have less reward value in adolescents with high levels of CU traits. However, social reward has not yet been systematically examined in relation to CU traits. It is important to understand the possible association between CU traits and social reward value in adolescents, as this may increase understanding of the mechanisms behind their callous and antisocial behaviour towards others.

    Our primary aim in the current study was to validate the Social Reward Questionnaire—Adolescent Version (SRQ-A), an adapted version of the adult SRQ, in a sample of typical 11–16 year olds. It is reasonable to expect that the types of social interactions that are rewarding for adults (as captured by the adult SRQ) will also be important sources of social reward for adolescents. Specifically, receiving the approval of others (as measured by the SRQ Admiration subscale) and engaging in meaningful reciprocal relationships (as measured by SRQ Prosocial Interactions) are likely to be particularly rewarding in adolescence, when peer approval and intimate friendships take on increasing importance [24,25]. Similarly, we expect the enjoyment of engaging in group interactions (as measured by SRQ Sociability) to be important at a time when individuals start spending more time with friends and report more positive affect from doing so [26]. We expect the enjoyment of being cruel, antagonistic and using others (as measured by SRQ Negative Social Potency) will also be relevant in adolescence, since some individuals in this age group report this reward as a motivation for bullying (e.g. [27]). We had no explicit expectation that the enjoyment of giving others control and allowing them to make decisions (as measured by SRQ Passivity) would be an important social reward in adolescence, but retained these items in the adolescent questionnaire for exploratory purposes.

    Two brief measures of personality traits were included for the purpose of construct validity: the Ten-Item Personality Inventory (TIPI [28]), a measure of five-factor model (FFM) personality traits, and the CU subscale of the Antisocial Process Screening Device (APSD [29]). We hypothesized that we would find similar associations in the current study to those found in our earlier studies with the adult SRQ [12,13]. For example, we hypothesized that the personality trait extraversion would be positively associated with enjoyment of admiration, and that openness to experience would be positively associated with enjoyment of sociability [12,13]. We also hypothesized that CU traits would show a pattern of ‘inverted’ social reward, in which adolescents with high levels of these traits report more enjoyment of negative social potency and less enjoyment of prosocial interactions, in line with our previous findings from adults with high levels of psychopathic traits [12,13].

    Data were collected from two state secondary schools in Greater London: one in South London (n = 382) and one in East London (n = 196). Ten participants had more than 20% of the SRQ-A data missing, indicating that the questionnaire had not been answered carefully. These participants were removed from all further analyses, leaving a final sample of n = 568. Participants were 11–16 years old (mean = 12.89, s.d. = 1.18; n = 19 did not disclose age). The sample was 50.0% male (n = 284) and 47.4% female (n = 269); 2.7% (n = 15) of the sample did not disclose gender. Ethnicity data were not collected due to constraints imposed by the schools.

    Both schools have a similar proportion of pupils claiming Free School Meal Entitlement, a useful proxy measure for pupil socioeconomic disadvantage (South London school: 17.10%; East London school: 17.70%; national average in England: 13.46%).

    Ethical approval covered data collection in schools.

    Participants completed the questionnaires by hand during their morning registration period. They completed the questionnaires in their class groups but were seated separately to ensure that their responses were private. Taking part took approximately 10 min and participants were not compensated. Data were entered into an SPSS (v. 20) database by two researchers; 10% of entries were cross-checked for accuracy.

    The items in the SRQ-A were taken from the adult SRQ, with some items removed or modified to ensure the content was appropriate for use with 11–16 year olds. These decisions were made based on discussions with a panel of six researchers with expertise in adolescent development. First, the Sexual Relationships subscale was removed in its entirety as its content is inappropriate for young adolescents. (This subscale consisted of three items: I enjoy having erotic relationships; I enjoy having many sexual experiences; I enjoy having an active sex life.) Therefore, the SRQ-A consisted of five subscales: Admiration, the enjoyment of being flattered and gaining positive attention; Negative Social Potency, the enjoyment of being cruel, antagonistic and using others; Passivity, the enjoyment of giving others control and allowing them to make decisions; Prosocial Interactions, the enjoyment of having kind and reciprocal relationships; and Sociability, the enjoyment of engaging in group interactions.

    In addition, the wording of two items was simplified: ‘I enjoy achieving recognition from others’ was changed to ‘I enjoy getting praise from others’ and ‘I enjoy feeling emotionally connected to someone’ was changed to ‘I enjoy feeling emotionally close to someone’. The final questionnaire has a Flesch–Kincaid Reading Grade Level of 6.64 [30] (see footnote 1), indicating that the wording should be understood by pupils in Grade 6 (USA) and above, i.e. those aged 11 years and older.

    In addition to the SRQ-A, participants completed the following questionnaires for the purposes of construct validity and to assess associations between the SRQ-A and CU traits. Brief measures were chosen due to constraints on testing time imposed by the schools.

    The TIPI [28] is a 10-item scale that measures the ‘Big Five’ personality traits (agreeableness, conscientiousness, extraversion, neuroticism and openness to experience; e.g. [31]). All items begin with ‘I see myself as’ and are followed by two descriptors, such as ‘anxious, easily upset’. Responses are given on a 1–7 scale (1 = disagree strongly, 7 = agree strongly). The TIPI was originally validated in an adult sample, but has since been used with adolescents (e.g. [32,33]). We had several hypotheses: SRQ-A Prosocial Interactions would be positively associated with agreeableness and conscientiousness; SRQ-A Negative Social Potency would be negatively correlated with these traits; and SRQ-A Sociability would be positively correlated with extraversion.

    The CU subscale of the APSD [29] is a 6-item measure, with each item scored from 0 to 2 (0 = not at all true, 1 = sometimes true, 2 = definitely true). This subscale measures CU traits, with items such as ‘You are concerned about the feelings of others’ and ‘You feel bad or guilty when you do something wrong’ (both reverse-coded). The self-report version of the APSD used here has good psychometric properties (e.g. [34]; although see [35]). Note that due to time constraints and the nature of our hypotheses, only the CU subscale was administered. We hypothesized that CU traits would be positively associated with SRQ-A Negative Social Potency and negatively associated with SRQ-A Prosocial Interactions, in line with findings with an adult measure of psychopathic traits [13].

    Before any analyses were conducted, 10 participants were removed for having between 20 and 100% of SRQ-A data missing (mean = 41.50%, s.d. = 0.27), as this indicated that the questionnaire had not been answered carefully. For all remaining analyses, all participants were retained, including those with less than 20% missing SRQ-A data. Participants with one or more items of missing questionnaire data (SRQ-A, TIPI or APSD; n = 106) did not differ from those without (n = 462) on gender (χ²(1, n = 553) = 1.36, p = 0.24; n = 15 did not disclose gender) or age (t(547) = −0.247, p = 0.80; n = 19 did not disclose age). Specific strategies for dealing with missing data are described in the following sections.
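
    For readers who wish to reproduce this screening step, a minimal sketch in Python/pandas follows. The data layout and column names (srq_1 … srq_20) are hypothetical, as the paper does not specify its data format:

    ```python
    import pandas as pd

    # Hypothetical layout: one row per participant, one column per SRQ-A item.
    SRQ_ITEMS = [f"srq_{i}" for i in range(1, 21)]

    def exclude_careless(df: pd.DataFrame, max_missing: float = 0.20) -> pd.DataFrame:
        """Drop participants with more than 20% of SRQ-A items unanswered."""
        missing_fraction = df[SRQ_ITEMS].isna().mean(axis=1)
        return df.loc[missing_fraction <= max_missing]
    ```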

    To assess the latent structure of the social reward item set in adolescents, CFA was conducted on the 20-item SRQ-A using Mplus v. 7.1 [36]. The sample size (n = 568) was adequate for testing a model consisting of 50 parameters (i.e. 20 factor loadings, 20 error variances and 10 factor correlations). Specifically, the subjects-to-parameters ratio for the 20-item model is approximately 11 : 1, which is higher than the 10 : 1 minimum ratio recommended by Bentler and Chou [37].
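
    The parameter count can be verified directly: one loading and one error variance per item, plus one correlation per pair of the five factors. A quick arithmetic check:

    ```python
    from math import comb

    n_items, n_factors, n = 20, 5, 568
    # 20 loadings + 20 error variances + C(5, 2) = 10 factor correlations
    n_parameters = n_items + n_items + comb(n_factors, 2)
    print(n_parameters)      # 50
    print(n / n_parameters)  # 11.36, i.e. roughly an 11:1 subjects-to-parameters ratio
    ```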

    We used the mean and variance-adjusted weighted least squares (WLSMV) estimation procedure as recommended for analysis of ordinal data [36]. Our intention was to assess whether the item set from adolescents showed the same factor structure (minus the Sexual Relationships factor) as that from adults [12]. The default in Mplus is to estimate latent models using all available data, including cases that have some missing values for some variables. Therefore, all available data were used for the CFA. The proportion of missing values for the current study was examined by a covariance coverage matrix, which provides an estimate of available observations for each pair of variables. The percentage of data present for each pair of variables ranged from 98 to 100%, indicating that the amount of missing data was minimal.

    As recommended by Hu & Bentler [38], we used a two-index strategy to assess model fit: the incremental Comparative Fit Index (CFI) and the Root Mean Square Error of Approximation (RMSEA), an absolute fit index. We adopted the conventional criteria of a CFI of 0.90 or above and an RMSEA of 0.08 or below [39] as indicative of acceptable model fit. Our rationale is that as model complexity increases, so does the difficulty of achieving conventional levels of model fit [40]; using these conventional criteria therefore avoids falsely rejecting a viable latent variable model.

    Cronbach α values and mean inter-item correlations (MICs) were measured to assess the internal consistency of each SRQ-A subscale. Given the limitations of Cronbach alpha and that it is not an indicator of scale unidimensionality [41], we relied more on the scale MICs to assess item homogeneity and internal consistency.
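
    Both statistics are simple to compute from a participants-by-items matrix. The sketch below is a generic implementation (not the authors' code) and assumes complete responses in a pandas DataFrame with one column per item of a subscale:

    ```python
    import numpy as np
    import pandas as pd

    def cronbach_alpha(items: pd.DataFrame) -> float:
        # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_variances.sum() / total_variance)

    def mean_interitem_correlation(items: pd.DataFrame) -> float:
        # Mean of the off-diagonal entries of the item correlation matrix.
        corr = items.corr().to_numpy()
        off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
        return float(off_diagonal.mean())
    ```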

    Using SPSS (v. 20), Pearson correlational analyses were conducted to assess associations between SRQ-A subscales and measures of personality (TIPI and CU subscale of the APSD). Pairwise correlations were calculated to maximize the use of available data. Benjamini and Hochberg false discovery rate [42] was used to control for the probability of making a type I error on multiple comparisons.
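
    The Benjamini–Hochberg procedure referenced here is short enough to state in full: sort the m p-values, find the largest rank i with p(i) ≤ (i/m)q, and reject every hypothesis up to that rank. A minimal sketch follows (statsmodels' multipletests with method='fdr_bh' would be the off-the-shelf equivalent):

    ```python
    import numpy as np

    def benjamini_hochberg(p_values, q=0.05):
        """Benjamini-Hochberg step-up procedure; returns True where H0 is rejected."""
        p = np.asarray(p_values, dtype=float)
        m = p.size
        order = np.argsort(p)
        thresholds = np.arange(1, m + 1) / m * q  # (i/m) * q for ranks i = 1..m
        below = p[order] <= thresholds
        reject = np.zeros(m, dtype=bool)
        if below.any():
            cutoff = np.flatnonzero(below).max()  # largest rank meeting the criterion
            reject[order[:cutoff + 1]] = True
        return reject

    # Example: five p-values at a false discovery rate of 0.05.
    print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.60]))
    # -> [ True  True False False False]
    ```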

    In order to measure the stability of responses over time, a subset of participants from one of the schools completed the SRQ-A twice, exactly one week apart. Pairwise Pearson correlational analyses were conducted to assess associations between subscale scores at the two time points, and the Benjamini and Hochberg false discovery rate [42] was used to control for the probability of making a type I error on multiple comparisons.

    Participants were from two different schools. We conducted a supplementary CFA that took into account the clustered nature of the data, in order to assess model fit while taking into account the degree of non-independence across cases.

    We assessed two types of invariance: metric (in which item loadings are fixed but thresholds are free) and scalar (more stringent; in which both item loadings and thresholds are fixed). Before conducting formal multiple-group CFA, model fit for the five-factor SRQ-A model was first tested for the total sample, and then the subsamples of males and females were examined separately. Next, we conducted multi-group CFA (MG-CFA) to test for the two types of invariance across males and females.

    To test for metric invariance, item loadings were constrained to be equal across the two genders, and we then assessed whether this model differed significantly from an unconstrained (i.e. configural) model in which both loadings and thresholds were free between the two groups. To test for scalar invariance, both item loadings and thresholds were constrained to be equal across gender, and this model was then compared with the configural model. (Note that items 1, 2 and 19 were omitted from the MG-CFA models because the female group did not have responses to all possible values: no females answered value 1 for items 1 and 2, and no females answered value 2 or 3 for item 19.)

    If the incremental change in the CFI (ΔCFI) between the configural and the MG-CFA models is less than or equal to 0.01, this indicates that the two models within the comparison do not differ statistically in terms of fit [7]. This would suggest relatively good to strong measurement invariance across gender [43]. Since the SRQ-A items are ordinal and the WLSMV estimation procedure was used, we also used the Mplus DIFFTEST procedure to generate traditional χ2 difference tests.
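
    As a sketch of this decision rule (using the CFI values reported in table 7; note that a ΔCFI of exactly 0.01, as for the scalar model below, sits on the boundary of the criterion):

    ```python
    def invariance_retained(cfi_configural: float, cfi_constrained: float) -> bool:
        # Fit is considered equivalent if constraining parameters worsens
        # the CFI by no more than 0.01 relative to the configural model.
        return (cfi_configural - cfi_constrained) <= 0.01

    print(invariance_retained(0.91, 0.91))  # metric model: delta CFI = 0.00
    print(invariance_retained(0.91, 0.90))  # scalar model: delta CFI = 0.01, the boundary case
    ```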

    The FFM based on the adult version of the questionnaire achieved good fit with the data from the total sample of adolescents (χ²(160) = 659.69, p < 0.001; CFI = 0.90; RMSEA = 0.07, 90% CI = 0.07–0.08). Factor loadings were in the range 0.33–0.82 (mean = 0.65, s.d. = 0.13) and are shown in table 1; a summary of the CFA results is shown in figure 1.


    Figure 1. Social Reward Questionnaire—Adolescent Version (SRQ-A). Correlation coefficients are shown in bold (**p < 0.01; only significant correlations are shown); standardized factor loadings are shown in italics.


    Table 1. Standardized factor loadings from the five-factor CFA.

    factor                     item number    loading
    Admiration                 1              0.73
                               7              0.65
                               10             0.80
                               16             0.63
    Negative Social Potency    3              0.70
                               5              0.48
                               8              0.68
                               12             0.79
                               15             0.78
    Passivity                  11             0.82
                               18             0.54
                               20             0.68
    Prosocial Interactions     2              0.65
                               6              0.33
                               14             0.50
                               17             0.64
                               19             0.74
    Sociability                4              0.60
                               9              0.50
                               13             0.75

    Cronbach α values and MICs were calculated for each subscale to assess internal consistency (table 2). Cronbach α values were in the range 0.56–0.74 (mean = 0.67, s.d. = 0.09). For some of the subscales, α falls below the cut-off point that is considered acceptable (0.70). However, Cronbach α is influenced by scale length, and these subscales contain only three to five items. It is also not a measure of scale unidimensionality. We therefore also calculated MICs, a measure of scale unidimensionality that is not affected by item number. MICs were in the range 0.25–0.43 (mean = 0.35, s.d. = 0.08). These fall within the acceptable range of 0.15–0.50 suggested by Clark & Watson [44]. All items showed good item-to-subscale-total correlations (range 0.60–0.82, mean = 0.71; s.d. = 0.07; table 2).

    Table 2. Correlations, Cronbach α values, MICs and mean item-scale correlations (MISCs) for manifest factor totals (n = 568). Corrected p-values are shown; Cronbach α values appear on the diagonal.

    factor                        1         2         3       4         5       MIC     MISC
    1. Admiration                 0.74                                          0.43    0.76
    2. Negative Social Potency    0.05      0.76                                0.39    0.72
    3. Passivity                  0.05      −0.07     0.67                      0.40    0.77
    4. Prosocial Interactions     0.39**    −0.40**   0.04    0.60              0.25    0.63
    5. Sociability                0.49**    0.00      0.02    0.37**    0.56    0.30    0.73

    A subset of participants completed the SRQ-A twice, 7 days apart (n = 46). To select participants to complete the SRQ-A twice, two classes were chosen at random from one of the schools. Data from five participants were excluded from the test–retest analysis: one participant answered ‘strongly disagree’ to 19/20 questions at Time 2, indicating that the questionnaire was not answered carefully; one participant had more than 20% of SRQ data missing at Time 1; and three participants who gave data at Time 1 were not available at Time 2. This left a final test–retest sample of n = 41, aged 11–13 (mean = 12.54, s.d. = 0.55). The sample was 36.60% (n = 15) male.

    At each time point, subscale scores were calculated if participants had 50% or more valid data for that subscale. Therefore, subscale scores were calculated for Admiration, Negative Social Potency and Prosocial Interactions if the participant had three or more valid answers, and for Passivity and Sociability if the participant had two or more valid answers. Pairwise Pearson correlations were conducted between SRQ-A subscale scores at Time 1 and Time 2. These were in the range 0.77–0.90 (mean = 0.81, s.d. = 0.06; all p < 0.001; table 3), indicating good test–retest reliability.
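
    A sketch of this scoring rule in pandas (item-to-subscale assignments follow table 1; the column names are hypothetical):

    ```python
    import pandas as pd

    def subscale_score(df: pd.DataFrame, item_columns: list) -> pd.Series:
        # Mean of the answered items, returned only when at least 50% of the
        # subscale's items have valid responses; otherwise the score is missing.
        valid_fraction = df[item_columns].notna().mean(axis=1)
        scores = df[item_columns].mean(axis=1)  # pandas ignores NaN by default
        return scores.where(valid_fraction >= 0.5)

    # e.g. Passivity (items 11, 18 and 20): two of three valid answers suffice.
    # passivity = subscale_score(data, ["srq_11", "srq_18", "srq_20"])
    ```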

    Table 3. Test–retest reliability: Pearson correlations between factor subtotal scores at Time 1 and Time 2 (time interval = 7 days).

    factor                     correlation between Time 1 and Time 2
    Admiration                 0.84
    Negative Social Potency    0.77
    Passivity                  0.78
    Prosocial Interactions     0.90
    Sociability                0.77

    As described in the ‘Data analysis procedure’ section, subscale scores were calculated if participants had 50% or more valid data for that subscale. Cronbach α values and MICs of the TIPI and CU subscale are reported in table 4. Pearson correlational analyses were used to explore the pattern of associations between the five SRQ subscales, the TIPI personality subscales and the CU subscale of the APSD (see table 5; only corrected p-values are presented).

    Table 4. Cronbach α values and MICs for the CU scale and TIPI. MICs for TIPI subscales consist of a single correlation, as each subscale is made up of two items only.

    factor               Cronbach α    MIC
    TIPI
      agreeableness      0.17          0.10
      conscientiousness  0.23          0.13
      extraversion       0.40          0.26
      neuroticism        0.40          0.26
      openness           0.29          0.17
    CU traits            0.30          0.08

    Table 5. Pearson correlations between SRQ subscales and external measures of personality and CU traits. All comparisons are corrected for multiple comparisons; correlations significant after correction are marked (*p < 0.05, **p < 0.01).

                              SRQ factor
                              Admiration   Negative Social Potency   Passivity   Prosocial Interactions   Sociability
    agreeableness        r    0.06         −0.39**                   0.07        0.28**                   0.11*
                         n    541          540                       541         541                      541
    conscientiousness    r    0.19**       −0.20**                   0.01        0.24**                   0.07
                         n    548          547                       548         548                      548
    extraversion         r    0.24**       0.00                      −0.17**     0.19**                   0.29**
                         n    549          548                       549         549                      549
    neuroticism          r    0.07         −0.07                     −0.02       0.01                     0.05
                         n    540          549                       550         550                      550
    openness             r    0.20**       −0.15**                   −0.01       0.30**                   0.26**
                         n    547          546                       547         547                      547
    CU traits            r    −0.14**      0.39**                    −0.12*      −0.37**                  −0.08
                         n    554          553                       554         554                      554

    Each SRQ-A subscale demonstrated a distinct pattern of associations with the personality subscales, indicating that each SRQ-A subscale measures a relatively distinct aspect of social reward. Admiration was positively associated with conscientiousness, extraversion and openness; Negative Social Potency was negatively associated with agreeableness, conscientiousness and openness; Passivity was negatively associated with extraversion; Prosocial Interactions was positively associated with agreeableness, conscientiousness, extraversion and openness; and Sociability was positively associated with agreeableness, extraversion and openness. These associations were in line with our hypotheses (see the ‘Measures’ section) and provide support for the meaning of each SRQ-A subscale.

    As hypothesized, CU traits were positively associated with Negative Social Potency and negatively associated with Prosocial Interactions. In addition, CU traits were negatively associated with Admiration and Passivity. It is important to note that measures of internal consistency fell below accepted cut-offs for the TIPI and the CU subscale, and associations between these measures and the SRQ-A should be interpreted with this in mind. Nevertheless, the patterns of associations with these two measures are in line with those reported for the adult SRQ, where the construct validity measures were more robust [12,13].

    Means, standard deviations and minimum and maximum scores for each subscale are given in table 6.

    Table 6. Descriptive statistics for each subscale.

    subscale                   minimum    maximum    mean (s.d.)    n
    Admiration                 2.33       7.00       5.46 (1.06)    568
    Negative Social Potency    1.00       7.00       2.77 (1.21)    567
    Passivity                  1.00       6.00       2.69 (1.16)    568
    Prosocial Interactions     2.20       7.00       5.73 (0.79)    568
    Sociability                1.00       7.00       5.40 (1.08)    568

    Given that the participants were clustered within one of two schools, we employed the COMPLEX analysis procedure available in Mplus for estimating model parameters when there is some degree of non-independence across cases. These supplementary CFA results for the five-factor SRQ-A model were excellent (CFI = 0.98, RMSEA = 0.05), providing further support for the model. Finally, to be comprehensive, we also ran a strict MG-CFA (equal loadings and thresholds) using the two schools as the grouping variable. The supplementary MG-CFA results were in line with those conducted across gender (CFI = 0.92, RMSEA = 0.05) and provided strong evidence for parameter invariance across the two school settings.

    The MG-CFA results indicated that factor loadings could be constrained to be equal across gender without meaningful change in model fit (ΔCFI ≤ 0.01; table 7). This provides evidence for partial (metric) invariance, and indicates that the SRQ-A items discriminate social reward value similarly across male and female 11–16 year olds. (Although the χ² difference test was significant, this test is sensitive to large sample sizes and is not recommended as the best means of assessing differences between two models [7,36].) Evidence for the more stringent scalar invariance was not found (the change in CFI reached the 0.01 boundary). However, the departure in model fit was by no means substantial, suggesting that the degree of non-invariance in thresholds (i.e. endorsement likelihood) was relatively minor. Moreover, while the absence of strict scalar invariance indicates that the SRQ-A items are not always endorsed similarly across gender, plots of the threshold differences by gender showed considerable similarity between females and males (see the electronic supplementary material, including tables S1 and S2).

    Table 7. Total sample and MG-CFA for the five-factor SRQ-A.

    CFA model                                     CFI     RMSEA    χ² diff            p-value
    overall model fit by total, female and male samples
    total                                         0.93    0.06
    female                                        0.93    0.06
    male                                          0.90    0.06
    multi-group analyses: tests for invariance across males and females
    configural (free loadings and thresholds)     0.91    0.06
    metric (fixed loadings, free thresholds)      0.91    0.06     χ²(11) = 43.52     0.00
    scalar (fixed loadings and thresholds)        0.90    0.06     χ²(90) = 147.23    0.00

    This paper describes the development of the 20-item SRQ-A. The SRQ-A is a valid and reliable measure of individual differences in the value of social rewards, for use with 11–16 year olds. CFA demonstrated that an FFM based on the adult SRQ [12] fitted the adolescent data well, and measurement invariance analyses indicated that this model applied to both males and females. The five latent factors are represented by five manifest variable subscales of social reward domains: Admiration, Negative Social Potency, Passivity, Prosocial Interactions and Sociability. Further analyses demonstrated that the SRQ-A has good internal consistency, test–retest reliability and construct validity. In particular, we found that CU traits were positively associated with the Negative Social Potency subscale (enjoyment of being cruel and antagonistic) and negatively associated with the Prosocial Interactions subscale (enjoyment of being kind and prosocial). This indicates that adolescents with high levels of CU traits show a pattern of ‘inverted’ social reward, in which being cruel is enjoyable and being kind is not, much like adults with high levels of psychopathic traits [13].

    Associations between the SRQ-A subscales and personality domains provide support for the meaning of each subscale and indicate that they are capturing distinct types of socially rewarding interactions. For example, the personality trait extraversion was positively associated with SRQ-A Admiration. Individuals with high levels of extraversion are sociable, friendly and seek out social interactions (e.g. [45]), so the positive association with the enjoyment of admiration found here provides support that the Admiration subscale is capturing enjoyment of positive social attention. Agreeableness, a personality trait describing warmth and kindness (e.g. [45]), was positively associated with SRQ-A Prosocial Interactions and negatively associated with SRQ-A Negative Social Potency. Agreeableness has previously been associated with motivation to use prosocial tactics, such as compromise, to resolve conflict [46]. Its positive association with SRQ-A Prosocial Interactions provides support that this subscale is capturing enjoyment of behaving prosocially, and its negative association with SRQ-A Negative Social Potency provides support that this subscale captures the reward value of behaving in an antagonistic and antisocial manner towards others.

    Analyses with CU traits indicate that adolescents with high levels of CU traits show a pattern of ‘inverted’ social reward, in a similar manner to adults with high levels of psychopathic traits [13]. Specifically, adolescents with high levels of CU traits report more enjoyment from being cruel, callous and antagonistic towards others, and less enjoyment from having affiliative, prosocial exchanges with others. This evidence of increased enjoyment from antisocial behaviour is in line with other research showing a moderate association between psychopathic and sadistic personality traits in adolescents [47]—indicating that adolescents with high levels of psychopathic-type traits find some enjoyment in hurting others. The positive association reported here between CU traits and SRQ-A Negative Social Potency suggests that the increased levels of antisocial behaviour seen in adolescents with high levels of CU traits (e.g. [48]) may be motivated partly by the reward value that this behaviour has for these individuals.

    Equally, the negative association between CU traits and enjoyment of prosocial relationships may provide an important explanation for why these individuals are less likely to affiliate with and behave prosocially towards others [17,22] and have long-term friendships [21]. Although low levels of prosocial behaviour and affiliation are well documented in descriptions of those with high levels of CU traits, the findings presented here are some of the first to examine why this behaviour is reduced in these individuals. The current findings present an interesting possibility that prosocial interactions and relationships may simply not be as enjoyable—or enjoyable at all—for adolescents with high levels of CU traits. If this is the case, it makes intuitive sense that these individuals are not motivated to engage in these behaviours (unless, of course, there is some consequence that is rewarding for them).

    This ‘inverted’ pattern of social reward—where being cruel is enjoyable and being kind is not—should be taken into account when considering interventions to reduce levels of CU traits. For example, it would be interesting to explore whether it is possible to modify the low levels of social reward these adolescents experience from prosocial interactions. Some authors have suggested that a particular emphasis on parental warmth and responsiveness may be important when designing interventions for adolescents with high levels of CU traits (e.g. [49,50]); in line with this, children both with and without high levels of CU traits responded equally well to an intervention that focused on increasing parental positive reinforcement to encourage prosocial behaviour [51]. More longitudinal, randomized controlled studies are needed to explore whether such parental interventions are effective at reducing CU traits and, importantly, whether the mechanism of change is associated with a change in the reward value of prosocial exchanges. Equally, it would be important to understand whether the increased reward value of antisocial behaviour could be reduced, for example by emphasizing the potential costs to the individual of behaving antisocially, or the potential gains of behaving prosocially.

    It is important to note differences between the current adolescent sample and the previous adult sample [13] with regard to associations between the SRQ subscales and psychopathic/CU traits. For example, the current study showed a modest negative association between CU traits and both enjoyment of admiration and enjoyment of passivity. In contrast, the previous adult sample showed positive associations between interpersonal psychopathic traits and enjoyment of admiration, and between interpersonal, lifestyle and antisocial psychopathic traits and enjoyment of passivity [13]. This presents an interesting possibility that gaining approval and praise (Admiration) and allowing others to take the lead (Passivity) may have different reward values for adults with high levels of psychopathic traits compared with adolescents with high levels of CU traits. For example, for adults, the Admiration subscale may be interpreted as social attention that is flattering and useful for manipulating others. Adolescents, on the other hand, may interpret the items in the Admiration subscale as indicative of gaining approval from authority figures such as teachers and parents, which may be undesirable to them. Similarly, passivity may be enjoyable for adults with high levels of psychopathic traits as allowing others to make decisions means less effort for the individual and more opportunity to be a ‘free loader’ [13]. By contrast, for adolescents, passivity may be interpreted as submission to authority. This may be particularly undesirable for adolescents with high levels of CU traits, who tend to rebel [17]. We acknowledge the speculative nature of these interpretations; further investigations are necessary to better understand how adults and adolescents interpret the meaning of the different SRQ subscales.

    Additionally, it is interesting to interpret the negative association between CU traits and the Admiration subscale in the context of what is known to be rewarding for typical adolescents. The current findings suggest that, as CU traits increase, admiration or approval from others is less rewarding. This is particularly interesting as this source of reward is considered to be so potent for typical adolescents [3,4]. As the SRQ-A asks participants to answer the questions in relation to all people in their lives, it is not clear whether this reduced reward value of admiration relates to interactions with peers, parents, teachers or some combination of these. It would be interesting to explore this further in future studies to better understand in what way gaining admiration and approval from others, a salient reward for most adolescents, may be less rewarding for those with high levels of CU traits.

    There are several limitations of the current study. First, the items were taken from the adult SRQ [12], after the Sexual Relationships subscale was dropped and the wording of two items was adjusted. For the adult SRQ, the final items were chosen after an EFA was conducted on a wider item set (75 items); the items in the final questionnaire were those that loaded most strongly and unambiguously onto their respective factors. In the adolescent version described here, these final adult items (with one subscale removed and two items slightly adjusted) were used to conduct a CFA with an adolescent sample. It is important to consider that if an entirely data-driven approach had been used with adolescents, in which an EFA was first conducted on a wider item set, different factors and items might have been discovered. However, the FFM taken from the adult SRQ did have good model fit with the adolescent sample. This indicates that, although an entirely data-driven approach was not used, the FFM used here still captures meaningful aspects of social reward in adolescents; this is also indicated by the associations between the SRQ-A subscales and external variables.

    Evidence for partial (metric) invariance was found across males and females, but not for strict scalar invariance. This indicates that while the FFM captures social reward value for both males and females, and the items discriminate equally well between males and females, the threshold at which an item is endorsed may differ between genders. Nevertheless, invariance across groups is often a matter of degree rather than all-or-nothing [52]. In this sense, the pattern of item endorsements between females and males evidenced a fair degree of concordance. Further investigations with the SRQ-A using item-level analyses would be beneficial to better understand the nuances of how these items capture social reward value for each gender.

    It is important to note that the internal consistency of the two external measures in this paper was low. Future studies should assess associations between the SRQ-A and other measures of personality traits with better internal consistency. Assessing associations between the SRQ-A and a more detailed measure of CU traits, such as the Inventory of Callous–Unemotional Traits (ICU [53]), would be particularly informative. The ICU is a multi-dimensional measure that assesses three facets of CU traits: callous, uncaring and unemotional features. Owing to time limits imposed by the schools in the current study, a brief measure of CU traits was used. Analysis using the ICU, which typically has higher reports of internal consistency [53], would allow the relationship between social reward value and CU traits to be explored in more detail. It is also important to note that the test–retest sample was from only one school. The psychometric properties of the SRQ-A will be strengthened by further assessments of its test–retest reliability in a wider sample of participants.

    A final limitation is the absence of data on ethnicity and socioeconomic status (SES) for this sample. The lack of this information may limit the generalizability of the current findings. It is critical that future investigations collect more detailed information about ethnicity and SES in order to more comprehensively assess the construct of social reward in adolescence.

    In this study, we describe the development and validation of the 20-item SRQ-A. The study provides initial data to support the validity and reliability of a measure to assess individual differences in the value of social rewards, for use with 11–16 year olds. Using CFA, we demonstrated that an FFM, based on the adult version of the SRQ, had good model fit with the adolescent sample. These five factors equate to the five subscales of the questionnaire: Admiration, Negative Social Potency, Passivity, Prosocial Interactions and Sociability. The questionnaire assesses individual differences in the reward value experienced from each of these social interaction domains. In addition, we presented analyses that provided initial support for the construct validity, internal reliability, test–retest reliability and partial gender invariance of the SRQ-A. In sum, the SRQ-A is a valid and reliable measure of individual differences in the value of social reward, for use in adolescent populations.

    The data collection and consent procedures were approved by the University College London Ethics Committee (Project identifier: 0622/001). Parental consent was obtained for all participants.

    The dataset supporting this article is available as the electronic supplementary material. The dataset is also freely available for public use at Dryad: http://dx.doi.org/10.5061/dryad.n399g [54].

    All authors helped to conceive and design the study. L.F. and R.R. collected the data. L.F. and C.S.N. analysed the data. All authors contributed to interpretation of the data. L.F. and E.V. drafted the article and all authors revised it critically. All authors approved the final version of the manuscript.

    We have no competing interests.

    This work was supported by a PhD studentship from the UK Medical Research Council (MR/J500422/1) to L.F. C.S.N. was supported by the William H. Donner Foundation; R.R. was supported by the UK Medical Research Council; E.V. was supported by a Royal Society Wolfson Research Merit Award and grant support from the UK Medical Research Council (MR/K014080/1).

    Footnotes

    1 The Flesch–Kincaid Grade Level is computed using the following calculation: Grade Level = 0.39(Total words/Total sentences) + 11.8(Total syllables/Total words) − 15.59.
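
    The footnote's formula translates directly into code. Below is a minimal Python sketch; the syllable counter is a crude vowel-group heuristic (an assumption, not part of the original formula), so results are approximate.

```python
# Flesch-Kincaid Grade Level, as defined in the footnote above.
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: one syllable per run of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

print(round(flesch_kincaid_grade("The cat sat on the mat. It was happy."), 2))
```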

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3732241.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

    References

    1. Blakemore S-J, Choudhury S. 2006 Development of the adolescent brain: implications for executive function and social cognition. J. Child Psychol. Psychiatry 47, 296–312. (doi:10.1111/j.1469-7610.2006.01611.x)
    2. Blakemore S-J. 2008 The social brain in adolescence. Nat. Rev. Neurosci. 9, 267–277. (doi:10.1038/nrn2353)
    3. Jones RM et al. 2014 Adolescent-specific patterns of behavior and neural activity during social reinforcement learning. Cogn. Affect. Behav. Neurosci. 14, 683–697. (doi:10.3758/s13415-014-0257-z)
    4. Sebastian C, Viding E, Williams KD, Blakemore S-J. 2010 Social brain development and the affective consequences of ostracism in adolescence. Brain Cogn. 72, 134–145. (doi:10.1016/j.bandc.2009.06.008)
    5. Davey CG, Yücel M, Allen NB. 2008 The emergence of depression in adolescence: development of the prefrontal cortex and the representation of reward. Neurosci. Biobehav. Rev. 32, 1–19. (doi:10.1016/j.neubiorev.2007.04.016)
    6. Steinberg L. 2010 A dual systems model of adolescent risk-taking. Dev. Psychobiol. 52, 216–224. (doi:10.1002/dev.20445)
    7. Cheung GW, Rensvold RB. 2002 Evaluating goodness-of-fit indexes for testing measurement invariance. Struct. Equ. Model. 9, 233–255. (doi:10.1207/S15328007SEM0902_5)
    8. Kohls G, Peltzer J, Herpertz-Dahlmann B, Konrad K. 2009 Differential effects of social and non-social reward on response inhibition in children and adolescents. Dev. Sci. 12, 614–625. (doi:10.1111/j.1467-7687.2009.00816.x)
    9. Demurie E, Roeyers H, Baeyens D, Sonuga-Barke E. 2012 The effects of monetary and social rewards on task performance in children and adolescents: liking is not enough. Int. J. Methods Psychiatr. Res. 21, 301–310. (doi:10.1002/mpr.1370)
    10. Cromheeke S, Mueller SC. 2015 The power of a smile: stronger working memory effects for happy faces in adolescents compared to adults. Cogn. Emot. 30, 288–301. (doi:10.1080/02699931.2014.997196)
    11. Chein J, Albert D, O'Brien L, Uckert K, Steinberg L. 2011 Peers increase adolescent risk taking by enhancing activity in the brain's reward circuitry. Dev. Sci. 14, 1–10. (doi:10.1111/j.1467-7687.2010.01035.x)
    12. Foulkes L, Viding E, McCrory E, Neumann CS. 2014 Social Reward Questionnaire (SRQ): development and validation. Front. Psychol. 5, 201. (doi:10.3389/fpsyg.2014.00201)
    13. Foulkes L, McCrory EJ, Neumann CS, Viding E. 2014 Inverted social reward: associations between psychopathic traits and self-report and experimental measures of social reward. PLoS ONE 9, e0106000. (doi:10.1371/journal.pone.0106000)
    14. Foulkes L, Bird G, Gökçen E, McCrory E, Viding E. 2015 Common and distinct impacts of autistic traits and alexithymia on social reward. PLoS ONE 10, e0121018. (doi:10.1371/journal.pone.0121018)
    15. Hare RD. 2003 Manual for the revised psychopathy checklist, 2nd edn. Toronto, ON: Multi-Health Systems.
    17. Frick PJ, Ray JV, Thornton LC, Kahn RE. 2013 Can callous-unemotional traits enhance the understanding, diagnosis and treatment of serious conduct problems in children and adolescents? A comprehensive review. Psychol. Bull. 140, 1–57. (doi:10.1037/a0033076)
    18. Lynam DR, Caspi A, Moffitt TE, Loeber R, Stouthamer-Loeber M. 2007 Longitudinal evidence that psychopathy scores in early adolescence predict adult psychopathy. J. Abnorm. Psychol. 116, 155–165. (doi:10.1037/0021-843X.116.1.155)
    19. Pardini D. 2011 Perceptions of social conflicts among incarcerated adolescents with callous-unemotional traits: ‘You're going to pay. It's going to hurt, but I don't care’. J. Child Psychol. Psychiatry 52, 248–255. (doi:10.1111/j.1469-7610.2010.02336.x)
    20. Viding E, Simmonds E, Petrides KV, Frederickson N. 2009 The contribution of callous-unemotional traits and conduct problems to bullying in early adolescence. J. Child Psychol. Psychiatry 50, 471–481. (doi:10.1111/j.1469-7610.2008.02012.x)
    21. Muñoz LC, Kerr M, Besic N. 2008 The peer relationships of youths with psychopathic personality traits: a matter of perspective. Crim. Justice Behav. 35, 212–227. (doi:10.1177/0093854807310159)
    22. Dadds MR, Allen JL, McGregor K, Woolgar M, Viding E, Scott S. 2014 Callous-unemotional traits in children and mechanisms of impaired eye contact during expressions of love: a treatment target? J. Child Psychol. Psychiatry 55, 771–780. (doi:10.1111/jcpp.12155)
    23. Hodsoll S, Lavie N, Viding E. 2014 Emotional attentional capture in children with conduct problems: the role of callous-unemotional traits. Front. Hum. Neurosci. 8, 570. (doi:10.3389/fnhum.2014.00570)
    24. Buhrmester D. 1990 Intimacy of friendship, interpersonal competence, and adjustment during preadolescence and adolescence. Child Dev. 61, 1101–1111. (doi:10.2307/1130878)
    25. Westenberg P, Drewes MJ, Goedhart AW, Siebelink BM, Treffers PD. 2004 A developmental analysis of self-reported fears in late childhood through mid-adolescence: social-evaluative fears on the rise? J. Child Psychol. Psychiatry 45, 481–495. (doi:10.1111/j.1469-7610.2004.00239.x)
    26. Larson R, Richards MH. 1991 Daily companionship in late childhood and early adolescence: changing developmental contexts. Child Dev. 62, 284–300. (doi:10.2307/1131003)
    27. Bosacki SL, Marini ZA, Dane AV. 2006 Voices from the classroom: pictorial and narrative representations of children's bullying experiences. J. Moral Educ. 35, 231–245. (doi:10.1080/03057240600681769)
    28. Gosling SD, Rentfrow PJ, Swann WB. 2003 A very brief measure of the Big-Five personality domains. J. Res. Pers. 37, 504–528. (doi:10.1016/S0092-6566(03)00046-1)
    29. Frick PJ, Hare RD. 2001 The antisocial process screening device (APSD). Toronto, ON: Multi-Health Systems.
    30. Kincaid JP, Fishburne RP, Rogers RL, Chissom BS. 1975 Derivation of new readability formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for navy enlisted personnel. Research Branch report 8-75. Memphis, TN: Naval Air Station.
    31. Costa PT, McCrae RR. 1992 Four ways five factors are basic. Pers. Individ. Dif. 13, 653–665. (doi:10.1016/0191-8869(92)90236-I)
    32. Erol RY, Orth U. 2011 Self-esteem development from age 14 to 30 years: a longitudinal study. J. Pers. Soc. Psychol. 101, 607–619. (doi:10.1037/a0024299)
    33. Harden KP, Tucker-Drob EM. 2011 Individual differences in the development of sensation seeking and impulsivity during adolescence: further evidence for a dual systems model. Dev. Psychol. 47, 739–746. (doi:10.1037/a0023279)
    34. Munoz LC, Frick PJ. 2007 The reliability, stability, and predictive utility of the self-report version of the Antisocial Process Screening Device. Scand. J. Psychol. 48, 299–312. (doi:10.1111/j.1467-9450.2007.00560.x)
    35. Poythress NG, Douglas KS, Falkenbach D, Cruise K, Lee Z, Murrie DC, Vitacco M. 2006 Internal consistency reliability of the self-report Antisocial Process Screening Device. Assessment 13, 107–113. (doi:10.1177/1073191105284279)
    36. Muthén LK, Muthén BO. 2012 Mplus. The comprehensive modelling program for applied researchers: user's guide. Los Angeles, CA: Muthén and Muthén.
    37. Bentler PM, Chou C-P. 1987 Practical issues in structural modeling. Sociol. Methods Res. 16, 78–117. (doi:10.1177/0049124187016001004)
    38. Hu L, Bentler PM. 1999 Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct. Equ. Model. 6, 1–55. (doi:10.1080/10705519909540118)
    39. West SG, Taylor AB, Wu W. 2012 Model fit and model selection in structural equation modeling. In Handbook of structural equation modeling (ed. Hoyle RH), pp. 209–231. New York, NY: Guilford Press.
    40. Marsh HW, Hau K-T, Wen Z. 2004 In search of golden rules: comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Struct. Equ. Model. 11, 320–341. (doi:10.1207/s15328007sem1103_2)
    41. Schmitt N. 1996 Uses and abuses of coefficient alpha. Psychol. Assessment 8, 350–353. (doi:10.1037/1040-3590.8.4.350)
    42. Benjamini Y, Hochberg Y. 1995 Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B (Methodol.) 57, 289–300.
    43. Mokros A, Neumann CS, Stadtland C, Osterheide M, Nedopil N, Hare RD. 2011 Assessing measurement invariance of PCL-R assessments from file reviews of North American and German offenders. Int. J. Law Psychiatry 34, 56–63. (doi:10.1016/j.ijlp.2010.11.009)
    44. Clark LA, Watson D. 1995 Constructing validity: basic issues in objective scale development. Psychol. Assessment 7, 309–319. (doi:10.1037/1040-3590.7.3.309)
    45. McCrae RR, Costa PT. 1987 Validation of the five-factor model of personality across instruments and observers. J. Pers. Soc. Psychol. 52, 81. (doi:10.1037/0022-3514.52.1.81)
    46. Jensen-Campbell LA, Graziano WG. 2001 Agreeableness as a moderator of interpersonal conflict. J. Pers. 69, 323–362. (doi:10.1111/1467-6494.00148)
    47. Chabrol H, Van Leeuwen N, Rodgers R, Séjourné N. 2009 Contributions of psychopathic, narcissistic, Machiavellian, and sadistic personality traits to juvenile delinquency. Pers. Individ. Dif. 47, 734–739. (doi:10.1016/j.paid.2009.06.020)
    48. Rowe R, Maughan B, Moran P, Ford T, Briskman J, Goodman R. 2010 The role of callous and unemotional traits in the diagnosis of conduct disorder. J. Child Psychol. Psychiatry 51, 688–695. (doi:10.1111/j.1469-7610.2009.02199.x)
    49. Frick PJ, White SF. 2008 Research review: the importance of callous-unemotional traits for developmental models of aggressive and antisocial behavior. J. Child Psychol. Psychiatry 49, 359–375. (doi:10.1111/j.1469-7610.2007.01862.x)
    50. Hyde LW et al. 2016 Heritable and nonheritable pathways to early callous-unemotional behaviors. Am. J. Psychiatry 173, 903–910. (doi:10.1176/appi.ajp.2016.15111381)
    51. Hawes DJ, Dadds MR. 2005 The treatment of conduct problems in children with callous-unemotional traits. J. Consult. Clin. Psychol. 73, 737. (doi:10.1037/0022-006X.73.4.737)
    52. Neumann CS, Schmitt DS, Carter R, Embley I, Hare RD. 2012 Psychopathic traits in females and males across the globe. Behav. Sci. Law 30, 557–574. (doi:10.1002/bsl.2038)
    53. Essau CA, Sasagawa S, Frick PJ. 2006 Callous-unemotional traits in a community sample of adolescents. Assessment 13, 454–469. (doi:10.1177/1073191106287354)
    54. Foulkes L, Neumann CS, Roberts R, McCrory E, Viding E. 2017 Data from: Social Reward Questionnaire—Adolescent Version and its association with callous–unemotional traits. Dryad Digital Repository. (http://dx.doi.org/10.5061/dryad.n399g)


    Page 14

    Cognitive intelligence is an attractive trait in a mate if it signals low exposure to environmental stressors and/or an ability to provide benefits that increase the reproductive fitness of a partner or offspring (reviewed in [1]). Indeed, intelligence may have been shaped, at least in part, by sexual selection [2]. Consistent with this proposal, cognitively complex skills such as song repertoire are positively correlated with reproductive success in birds (reviewed in [3]). Moreover, other complex skills, such as vocal mimicry during courtship [4] and problem-solving ability [5], are correlated with mating success in birds. Innovative or novel displays may also present conspecifics with a source of valuable information that can be transmitted between groups, as demonstrated in the transmission of novel whale song across groups ([6,7]; reviewed in [8]). Collectively, creativity, as an index of cognitive intelligence (e.g. [1,8–11]), may be a particularly desirable trait in a mate or social partner.

    In humans, creativity and intelligence are attractive in a romantic partner (reviewed in [12]), with recent evidence in female twins suggesting that preferences for these traits have a heritable basis [13]. Moreover, among creative professionals, dimensions of schizotypy (i.e. potentially costly traits) such as cognitive aberrations and magical thinking have an indirect effect on their mating success via the extent of their self-reported creative activity [14]. As creative displays are thought to have particular benefits to the reproductive fitness of those who compete for the more-selective sex (e.g. [15]), most research on creativity and mate choice has focused on the attractiveness of creativity in men (e.g. [11,16]), although women's attractiveness may also be enhanced via their creativity (see [12] for discussion). Consistent with this proposal, experimentally activating both short- and long-term mating goals directly increases men's creativity, while mating goals increase women's creativity in long-term contexts only [12]. Collectively, both men and women may enhance their attractiveness through creative displays.

    Although creativity may be a desirable quality in humans and non-human species (e.g. [1,8,12]), it is unclear whether social knowledge of one's creativity has effects on attractiveness that are independent of, or are qualified by, physical indices of biological quality such as facial attractiveness ([17]; see also [18]). Indeed, studies testing for integration of social knowledge with physical cues in social judgements of attractiveness are rare (see [19] for an exception) and, in non-human species, no empirical work, to the author's knowledge, has directly tested whether simultaneous assessment of attractiveness and putative cues to intelligence shapes mate choice (reviewed in [1]). While previous work suggests that videos of creative men are judged as attractive by women when statistically controlling for differences in men's attractiveness [11], that work did not examine written creative output or consider whether its findings generalize to creative displays by women or to contexts unrelated to mating, such as the attractiveness of a social partner more generally. Here, this issue is addressed using both faces and short-story extracts that, unknown to participants, had been rated for attractiveness and creativity, respectively, in order to test for effects of both story-telling ability and facial attractiveness on global attractiveness.

    As creative displays are thought to provide men with a particular advantage in mating success (e.g. [9,11,15,16]), creativity would be predicted to have an independent effect on men's, but not necessarily women's, attractiveness if, for example, physical attractiveness has a stronger positive effect on women's than men's mating success (e.g. [20]; see also [21]), which, in turn, would reduce (or eliminate) any positive effects of creativity on their attractiveness when integrated with knowledge of their physical attractiveness. However, if the effects of creativity on attractiveness are qualified by facial attractiveness, we can establish whether creativity enhances attractiveness to a greater extent when individuals judge less attractive faces (i.e. cognitive intelligence ‘offsets’ low physical attractiveness) or, alternatively, when individuals judge more attractive faces (i.e. social knowledge of creativity enhances the appeal of physically attractive social/romantic partners). Evidence that creativity has effects on attractiveness that are independent of, or interact with, facial attractiveness would provide initial experimental evidence in light of theoretical discussion on the role of cues to biological quality versus cues to cognitive intelligence in reproductive fitness [1].

    Finally, the experiment will test whether creativity enhances attractiveness to a greater extent when judging opposite-sex faces (i.e. potential mates) or whether the effects of creativity on attractiveness generalize across both mating and non-mating contexts. While evidence for the latter prediction would not rule out the utility of creativity in mate choice, it would be consistent with recent proposals that innovation is important in the transmission of valuable information across groups more generally [8]. Indeed, attraction to written creativity may generalize to non-mating contexts if, for example, language transmits useful information at no loss to the transmitter [22], and intelligence, a prestigious trait [23], is important for increasing the leverage of group members [24] and for group cooperation and cohesion [25], which, in turn, can facilitate access to mates (e.g. [26]).

    Eighty-nine participants (21 male, Mage = 23.01 years, s.d. = 8.18 years) took part in the experiment. Participants took part in class under test conditions, in exchange for partial course credit. The experiment was presented alongside other randomized tasks unrelated to the current research.

    A publicly available database of face photographs (KDEF, [27]), which had been independently rated for attractiveness by a panel of judges, was used in the current experiment (standardized attractiveness and intelligence ratings provided by Oosterhof & Todorov [28]). Within the face set, eight male and eight female faces were selected, with four attractive faces selected for each sex and four faces around the mid-point of the entire set selected for each sex. All photographs were of Caucasian individuals taken in a standardized manner with neutral expression, direct gaze and no adornments (562 × 762 pixels). The mean attractiveness rating of the attractive faces (Mmen = 0.98, Mwomen = 1.04) was greater than the mean attractiveness rating of the less attractive faces (Mmen = −0.35, Mwomen = −0.36, t14 = 23.22; p < 0.001). Within each face set, male and female faces did not differ from one another on rated attractiveness (both t6 < 0.56; both p > 0.60). When split by gender, the attractive face sets did not differ in rated intelligence from the less attractive face sets (both absolute t6 < 1.51; both p > 0.18). The attractive men did not differ in rated intelligence from the attractive women and the less attractive men did not differ in rated intelligence from the less attractive women (both absolute t6 < 1.53; both p > 0.18).

    A separate group of 38 participants (18 male, Mage = 27.76 years, s.d. = 7.01 years) provided short-story extracts in the laboratory prior to the experiment. Using procedures adapted from Griskevicius et al. [12], participants were given up to 5 min to write a short story based on what they thought was happening in a painting (an A3 landscape colour printout of The Lovers by René Magritte, 1928). This painting was selected specifically in order to measure creativity in the context of romantic writing. Participants were not encouraged to be creative or informed that we were measuring their creativity. Each extract (Mean length = 105 words, Mean time to completion = 285 s) was then proofread by the researcher for mistakes with spelling or grammar and then rated by a panel of judges (three male, three female, Mage = 26.78 years, s.d. = 2.85 years) for eight traits related to creativity (creative, original, clever, imaginative, captivating, funny, entertaining and charming) on a 1 (not at all) to 9 (very) scale (sensu [12]). Raters were explicitly informed that each extract was produced by a separate individual looking at an identical painting and that each individual was given the same amount of time to write a short story based on what they thought was happening in the image. Raters were not shown the painting nor were they given any information about the actual painting. Agreement between judges was acceptable (Cronbach's α = 0.72). Two judges (one male and one female) only rated a subset of extracts; however, all data are included in correlational analyses. All of the dimensions related to creativity were significantly correlated with one another (all r38 > 0.47, all p < 0.01), except for ‘funny’ and some of the rated dimensions (notably ‘creative’ r38 = 0.11, p = 0.52). As the focus of the current experiment was creativity, a composite measure of creativity was created by averaging all of the ratings for each extract except its rated funniness (Mglobal creativity = 3.74, s.d. = 0.78). From these ratings, eight extracts produced by men (Mhigh male = 4.84, s.d. = 0.10, Mlow male = 3.34, s.d. = 0.39) and eight extracts produced by women (Mhigh female = 4.80, s.d. = 0.56, Mlow female = 3.34, s.d. = 0.16) were selected. To measure spontaneous creativity, extracts were only selected if the participant reported after the task that they were not familiar with the specific painting. Half of the extracts were creative and half were less creative, with extracts from the two sets differing significantly from one another on global rated creativity excluding funniness (t14 = 9.08; p < 0.001). Within each set, extracts did not differ in creativity according to the gender of the writer (both t6 < 0.12, both p > 0.91). Extracts were then randomly paired with faces by the experimenter, resulting in (separately for each gender) two attractive–creative face-extract pairs, two attractive–less-creative face-extract pairs, two less-attractive–creative face-extract pairs and two less-attractive–less-creative face-extract pairs (see examples in figure 1). Extracts and faces were matched by sex. All participants rated the same face-extract pairings.
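
    The two scoring steps described above (inter-judge agreement and the composite creativity score) are simple to express in code. The following sketch uses simulated ratings with an assumed array layout (extracts × judges × traits, with ‘funny’ as the last trait); it illustrates the computations, not the study's actual data pipeline.

```python
# Cronbach's alpha across judges, plus a composite creativity score that
# averages all trait ratings except funniness. Data here are simulated.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    # items: observations (rows) x judges (columns)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
ratings = rng.integers(1, 10, size=(38, 6, 8)).astype(float)  # 38 extracts, 6 judges, 8 traits

judge_means = ratings.mean(axis=2)  # each judge's mean rating of each extract
print(f"inter-judge alpha = {cronbach_alpha(judge_means):.2f}")

composite = ratings[:, :, :7].mean(axis=(1, 2))  # drop the final ('funny') trait
print(composite.round(2)[:5])
```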

    Figure 1. Example attractive (a(i),(iii)) and less attractive (a(ii),(iv)) male and female faces. Example short-story extracts rated as creative (b(i)) and less creative (b(ii)). Extracts and faces were matched by sex in the experiment.

    Participants were informed that the task involved making judgements about individuals, and were explicitly informed that each of the 16 pictured individuals in the experiment had been provided with the same painting and the same amount of time to write a short story based on what they thought was happening in the image. They were instructed that, for each individual they saw, they should read that person's extract and rate how attractive they judged that person to be. Participants were not shown the painting, nor were they provided any information about the actual painting. The task was run via surveymonkey.com. Each facial photograph was centred on the screen with the short-story extract centred below the picture (eight male trials and eight female trials). Participants were asked to indicate how attractive they thought the individual was using the scale: much less attractive than average (=1), less attractive than average (=2), slightly less attractive than average (=3), of average attractiveness (=4), slightly more attractive than average (=5), more attractive than average (=6), much more attractive than average (=7). Trial order was randomized.

    Data for each participant consist of their mean response across the two trials in each combination of facial attractiveness (attractive and less attractive) and creativity of extract (high and low). These repeated-measures data were calculated separately for judgements of men and judgements of women. High scores indicate high rated attractiveness based on integration of knowledge of the pictured individual's facial attractiveness with knowledge of their creativity.
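
    As a sketch, this data reduction is a single grouped mean over long-format trial data. The column names below are hypothetical; the point is the structure of the repeated-measures aggregation, not the study's actual files.

```python
# Collapse two trials per design cell into one mean rating per participant,
# per face sex x facial attractiveness x creativity condition.
import pandas as pd

trials = pd.read_csv("trials.csv")  # hypothetical columns: participant,
                                    # face_sex, face_attr, creativity, rating

cell_means = (
    trials
    .groupby(["participant", "face_sex", "face_attr", "creativity"],
             as_index=False)["rating"]
    .mean()  # mean of the two trials in each condition
)

# One row per participant, one column per within-subjects cell
wide = cell_means.pivot_table(index="participant",
                              columns=["face_sex", "face_attr", "creativity"],
                              values="rating")
print(wide.head())
```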

    A mixed-design ANOVA with the within-subjects factors face sex (male and female), facial attractiveness (high and low) and creativity (high and low) and the between-subjects factor sex of participant (men and women) was conducted. This analysis revealed a main effect of facial attractiveness (F1,87 = 228.72; p < 0.001, ηp2 = 0.72) that was qualified by an interaction with creativity (F1,87 = 5.65; p = 0.02, ηp2 = 0.06). A significant effect of creativity (F1,87 = 34.23; p < 0.001, ηp2 = 0.28) was qualified by an interaction with sex of participant (F1,87 = 4.41; p = 0.04, ηp2 = 0.05).

    There was a significant two-way interaction between face sex and participant sex (F1,87 = 6.36; p = 0.013, ηp2 = 0.07) and face sex and creativity (F1,87 = 54.38; p < 0.001, ηp2 = 0.39). Consistent with hypotheses, a significant higher-order interaction was found between face sex, facial attractiveness and creativity (F1,87 = 5.94; p = 0.017, ηp2 = 0.06). No other effects or interactions were significant (all F < 1.49, all p > 0.22).

    To interpret the significant three-way interaction between face sex, facial attractiveness and creativity, separate 2 × 2 ANOVAs were conducted on the within-subjects factors facial attractiveness and creativity, for judgements of women's faces and judgements of men's faces. Analyses on judgements of men's faces revealed a significant effect of creativity (F1,88 = 87.51; p < 0.001, ηp2 = 0.50), a significant effect of facial attractiveness (F1,88 = 163.40; p < 0.001, ηp2 = 0.65) and no significant interaction between creativity and facial attractiveness (F1,88 = 0.03; p = 0.87). Paired samples t-tests collapsed across facial attractiveness revealed that men with creative story-telling ability (M = 3.75, s.e.m. = 0.09) were judged as more attractive than men with less creative story-telling ability (M = 3.08, s.e.m. = 0.09, t89 = 9.36; p < 0.001, d = 0.99). When collapsed across creative ability, men with attractive faces (M = 3.99, s.e.m. = 0.09) were judged as more attractive than men with less attractive faces (M = 2.84, s.e.m. = 0.10, t89 = 12.78; p < 0.001, d = 1.36, see figure 2).
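
    The follow-up comparisons above are paired-samples t-tests with Cohen's d. A minimal scipy sketch on simulated per-participant means is given below; d is computed here as the mean difference over the standard deviation of the differences, one common repeated-measures convention (the paper does not state which variant it used).

```python
# Paired t-test plus a repeated-measures Cohen's d, on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
creative = rng.normal(3.75, 0.9, size=89)             # mean rating of creative men
less_creative = creative - rng.normal(0.67, 0.7, 89)  # mean rating of less creative men

t, p = stats.ttest_rel(creative, less_creative)
diff = creative - less_creative
d = diff.mean() / diff.std(ddof=1)
print(f"t({len(diff) - 1}) = {t:.2f}, p = {p:.3g}, d = {d:.2f}")
```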

    Figure 2. Independent effects of men's creativity and facial attractiveness on their overall attractiveness. The independent effects of facial attractiveness (a, Cohen's d = 1.36) and creativity (b, Cohen's d = 0.99) on men's attractiveness were both large.

    Data for judgements of women's faces revealed a significant effect of creativity (F1,88 = 5.07; p = 0.03, ηp2 = 0.06), a significant effect of facial attractiveness (F1,88 = 186.02; p < 0.001, ηp2 = 0.68) and a significant interaction between creativity and facial attractiveness (F1,88 = 22.15; p < 0.001, ηp2 = 0.20, figure 3). This significant interaction reflected, in part, high creativity reducing the overall attractiveness of less attractive women (t88 = 4.90; p < 0.001, d = 0.52). Facially attractive and creative women were preferred relative to less creative, less attractive women (t88 = 8.12; p < 0.001, d = 0.86) and relative to creative women with less attractive faces (t88 = 14.54; p < 0.001, d = 1.54). Facial attractiveness enhanced the overall attractiveness of less creative women (t88 = 6.98; p < 0.001, d = 0.74) and less creative but attractive women were preferred relative to creative but less attractive women (t88 = 14.27; p < 0.001, d = 1.51). Creativity did not enhance the overall attractiveness of women with attractive faces (t88 = 1.85; p = 0.07, d = 0.20).

    Figure 3. The significant two-way interaction between facial attractiveness and creativity in judgements of women's attractiveness (ηp2 = 0.20). Asterisks indicate medium (*) and large effect sizes (**) using Cohen's d.

    As written creative expression and its appreciation are, by definition, highly variable from the point of view of the reader and the writer, it is important to establish whether the current pattern of results generalizes beyond the stimuli used in the current experiment (i.e. to creative thinking across different contexts). Thus, a follow-up experiment was conducted using a well-established measure of divergent (i.e. creative) thinking (the alternative uses test; [29]; see e.g. [30] for recent discussion) in order to generate simple response lists that differ in creativity. Importantly, this replication attempt will reveal the extent to which the initial higher-order interaction between face sex, attractiveness and creativity generalizes across contexts, as response lists that differ in creativity remove aspects of semantic content that are not related to impressions of the producer's creativity (e.g. other impressions of character or intent inferred from idiosyncrasies within a short piece of creative writing).

    One hundred and twenty-five participants were recruited using the ‘buy responses’ function on surveymonkey.com. All participants were volunteers (donations to charities are offered as incentives in exchange for participation) and were selected from a wide panel of respondents via surveymonkey, with the experimenter selecting target criteria of an 18–30-year-old American sample to ensure a mean age comparable to the initial experiment. The experimenter specified a sample size of 90 and an estimated response rate of 75–100% (i.e. surveymonkey sends the task to more respondents based on these criteria). Participants were excluded from analyses if they did not respond to all trials in the experiment or did not specify a gender of male or female. One participant was also excluded for responding at one end of the response scale on every trial. This resulted in a final sample size for analysis of 104 participants (49 of whom were male; Mage = 24.51 years, s.d. = 5.07 years; three participants did not specify their age). Online and laboratory studies generally produce equivalent results (e.g. [31]; reviewed in [32]).

    The follow-up experiment used identical face stimuli to the initial experiment. In order to generate a set of creative responses, a separate group of 31 volunteers (14 of whom were male, Mage = 24.29 years, s.d. = 6.94 years) completed a widely used measure of divergent thinking (the alternative uses task; [29]; Form B, part 2; see e.g. [30] for recent discussion) in class under test conditions. Following the user manual, participants had up to 4 min to think of up to six alternative uses for three everyday objects listed in the booklet. Responses to one object (a car tyre) were selected, given a similar level of variability in the number of valid responses among males and females (Mmale = 2.29 answers, Mfemale = 2.47 answers). To measure spontaneous creativity, and consistent with the instruction booklet, participants were not informed that the task measures creative thinking and were not encouraged to be creative in their responses. In line with the scoring guidelines, responses were selected if the use for the object was feasible, and counted only if it differed from the participant's other responses and from the example provided within the booklet. For both male and female participants, the number of valid responses to this question ranged from 0 to 6 alternative uses.

    For participants who produced at least one valid response to this item (26 participants), all response lists were then rated by a separate panel of judges (three males and three females, Mage = 28.32 years, s.d. = 2.65 years) in an identical manner to the initial experiment (i.e. the same eight traits related to creativity and identical rating scale). Raters were informed that each list shows the response of one individual who was asked to list up to six alternative uses for the same everyday object (a car tyre), and that each individual was given the same amount of time, under test conditions, to do so (and to rate each extract in light of this). Agreement between raters was good (Cronbach's α = 0.86). Across raters, ratings of each list on all eight dimensions related to creativity were correlated with one another (all r26 > 0.63, all p < 0.01). Following the initial experiment, a composite measure of creativity was created by averaging all of the ratings for each list except its rated funniness (MGlobal creativity = 4.01, s.d. = 1.61). From these ratings, eight male response lists (Mhigh male = 6.38, s.d. = 1.23, Mlow male = 2.99, s.d. = 0.70) and eight female response lists (Mhigh female = 5.47, s.d. = 0.76, Mlow female = 2.44, s.d. = 0.23) were selected. Half of the response lists were creative (e.g. ‘Use as a rope swing, cut in half and place side by side to make a Loch Ness Monster sculpture, as a garden planter, re-appropriate and use as parts for a go-kart, cut up and use as knee/elbow-pads’) and half were less creative (e.g. ‘Use as a seat’), with the two list sets differing from one another on global rated creativity excluding funniness (t14 = 7.53; p < 0.001). Within each set, lists did not differ in creativity according to the gender of the participant who produced the list (both t6 < 1.50, both p > 0.18). The same faces used in the same four conditions within the initial experiment were then randomly paired with lists by the experimenter, resulting in (separately for each gender) two attractive–creative face-list pairs, two attractive–less-creative face-list pairs, two less-attractive–creative face-list pairs and two less-attractive–less-creative face-list pairs. Lists and faces were matched by sex.

    The procedure, rating scale and processing of data in the follow-up experiment were identical to the initial experiment, except that participants were recruited to take part in a task rating the attractiveness of ‘thinkers’. Participants were informed that they would be asked to make judgements based on the responses of pictured individuals to a standard cognitive task used in psychology. They were informed that each of the 16 pictured individuals was provided with the same everyday object (a car tyre) and the same amount of time to list up to six alternative uses for that object under test conditions. They were asked, for each individual, to read that person's response to the task (presented below the face, with each answer presented in list form) and rate how attractive they judge each person to be.

    A mixed-design ANOVA with the within-subjects factors face sex (male and female), facial attractiveness (high and low) and creativity (high and low) and the between-subjects factor sex of participant (men and women) was conducted. This analysis revealed a main effect of face sex (F1,102 = 14.81; p < 0.001, ηp2 = 0.13), a main effect of creativity (F1,102 = 27.42; p < 0.001, ηp2 = 0.21), and a main effect of facial attractiveness (F1,102 = 171.78; p < 0.001, ηp2 = 0.63). The main effect of creativity was qualified by an interaction with sex of participant (F1,102 = 4.23; p = 0.04, ηp2 = 0.04) and was, separately, qualified by an interaction with face sex (F1,102 = 68.45; p < 0.001, ηp2 = 0.40). The main effect of facial attractiveness was qualified by an interaction with face sex (F1,102 = 8.08; p < 0.01, ηp2 = 0.07). Consistent with the initial experiment, a higher-order interaction was observed between face sex, facial attractiveness and creativity (F1,102 = 24.86; p < 0.001, ηp2 = 0.20). No other effects or interactions were significant (all F < 1.25, all p > 0.26).

    In order to interpret the significant three-way interaction between face sex, facial attractiveness and creativity, separate 2 × 2 ANOVAs were conducted on the within-subjects factors facial attractiveness and creativity, for judgements of women's faces and judgements of men's faces. Analyses on judgements of men's faces revealed a main effect of creativity (F1,103 = 71.34; p < 0.001, ηp2 = 0.41), a main effect of facial attractiveness (F1,103 = 57.97; p < 0.001, ηp2 = 0.36) and, in contrast to the initial experiment, an interaction between creativity and facial attractiveness (F1,103 = 12.60; p < 0.01, ηp2 = 0.11, figure 4a). Paired samples t-tests to interpret this interaction demonstrated that overall rated attractiveness differed among all four conditions in the task (all absolute t > 3.41, all p < 0.01, all 0.32 < d < 1.13), except for creative but less attractive men and attractive but less creative men, who were equivalent in overall attractiveness (t103 = 0.20; p = 0.84). Crucially, the interaction demonstrates that the positive effects of creativity on overall attractiveness are relatively more substantial for less attractive men (d = 0.75) than they are for attractive men (d = 0.50).

    Figure 4. Replication of the significant interaction between face sex, creativity and facial attractiveness on overall attractiveness (ηp2 = 0.20). (a) Creativity has a more substantial effect on the attractiveness of men with average faces than it does for men with attractive faces (asterisks indicate * small, ** medium and *** large effect sizes). (b) Facial attractiveness has a more substantial effect on women's attractiveness than does their creativity. When comparing conditions (panel b only), all differences (four comparisons) represent a large effect size (Cohen's d) unless otherwise indicated (two comparisons).

    Analyses of judgements of women's faces revealed a main effect of facial attractiveness (F1,103 = 190.60; p < 0.001, ηp2 = 0.65) which was qualified by an interaction with creativity (F1,103 = 10.62; p < 0.01, ηp2 = 0.09). There was no main effect of creativity (F1,103 = 0.40; p = 0.53). Paired samples t-tests to interpret this interaction demonstrated that overall rated attractiveness differed among all conditions in the task (all absolute t > 2.63, all p < 0.01, all 0.25 < d < 1.15) except for less creative but attractive women and creative but attractive women, who were equivalent in overall attractiveness (absolute t103 = 1.69; p = 0.094, d = 0.17, figure 4b).

    In order to establish whether the pattern of significant results in the initial experiment generalizes to other sets of faces, an additional online experiment was conducted using a new set of faces that had been rated for attractiveness by a separate panel of judges.

    Ninety-eight participants (60 males and 38 females, Mage = 24.13 years, s.d. = 3.17 years) were recruited via an online testing platform (Prolific Academic, www.prolific.ac; see [33] for a recent review of this platform in behavioural research). All participants received the equivalent of £5 per hour for participating and were selected from a wide panel of potential respondents, with the experimenter selecting the target criteria of an 18–30-year-old sample. One participant was excluded from analysis for not specifying their gender.

    The final experiment used identical short-story extracts to the initial experiment. A separate, widely used database of face photographs [34] was used in this experiment. Eight male and eight female faces were selected from the full face set, with four attractive faces selected for each sex and four faces from around the mid-point of the entire set selected for each sex. The full set of photographs was of Caucasian individuals aged 18–29 taken in a standardized manner with neutral expression and direct gaze (639 × 480 pixels). Faces were standardized on pupil position and masked to remove clothing and background and to minimize hair cues. These faces were then rated in a randomized order in the centre of the screen by a separate panel of judges in an online study run via surveymonkey.com (male face ratings: N = 21 females, 12 males, Mage = 26.84 years, s.d. = 8.56 years; female face ratings: 35 males, 25 females, 2 undisclosed gender, Mage = 29.19 years, s.d. = 9.64 years) for attractiveness on a 1 (much less attractive than average) to 7 (much more attractive than average) scale. Participants were randomly allocated to rate either 36 women's faces or 36 men's faces. Scores for each face set were then standardized separately. The mean attractiveness rating of the attractive faces (Mmen = 1.54, Mwomen = 1.55) was greater than the mean attractiveness rating of the less attractive faces (Mmen = −0.06, Mwomen = −0.04, t14 = 19.42; p < 0.001). Within each face set, male and female faces did not differ from one another on rated attractiveness (both t6 < 0.24, both p > 0.82). As in the initial experiment, extracts were then randomly paired with faces by the experimenter and all participants viewed the same face-extract pairings. The procedure, rating scale and processing of data in this experiment were identical to the initial experiment.
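
    The standardization step above amounts to z-scoring each face set separately before selecting the attractive and mid-point subsets. A minimal sketch on simulated ratings:

```python
# Z-score attractiveness ratings separately within the male and female
# face sets; faces would then be drawn from the top of each distribution
# (attractive) and from around its mid-point (less attractive).
import numpy as np

def standardize(scores: np.ndarray) -> np.ndarray:
    return (scores - scores.mean()) / scores.std(ddof=1)

rng = np.random.default_rng(2)
male_ratings = rng.uniform(1, 7, size=36)    # mean rating of each male face
female_ratings = rng.uniform(1, 7, size=36)  # mean rating of each female face

male_z, female_z = standardize(male_ratings), standardize(female_ratings)
print(male_z.round(2)[:5], female_z.round(2)[:5])
```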

    A mixed-design ANOVA with the within-subjects factors face sex (male and female), facial attractiveness (high and low) and creativity (high and low) and the between-subjects factor sex of participant (men and women) was conducted. This analysis revealed a main effect of face sex (F1,96 = 14.97; p < 0.001, ηp2 = 0.14), a main effect of facial attractiveness (F1,96 = 75.56; p < 0.001, ηp2 = 0.44) and a main effect of creativity (F1,96 = 12.74; p < 0.01, ηp2 = 0.12). The main effect of face sex was qualified by an interaction with facial attractiveness (F1,96 = 6.66; p = 0.011, ηp2 = 0.07). The main effect of creativity was qualified by an interaction with facial attractiveness (F1,96 = 7.85; p < 0.01, ηp2 = 0.08). No other effects or higher-order interactions were significant (all F < 2.58, all p > 0.11). The interaction between face sex and facial attractiveness reflected a more substantial effect of facial attractiveness on women's overall attractiveness (t97 = 9.53; p < 0.001, d = 0.97) than on men's overall attractiveness (t97 = 4.73; p < 0.001, d = 0.48). The interaction between creativity and facial attractiveness reflected an effect of creativity on the overall attractiveness of less attractive faces (MCreative = 3.81, s.e.m. = 0.11, MLess Creative = 3.49, s.e.m. = 0.11, t97 = 4.08; p < 0.001, d = 0.41) but no effect of creativity on the overall attractiveness of attractive faces (MCreative = 4.33, s.e.m. = 0.10, MLess Creative = 4.31, s.e.m. = 0.10, t97 = 0.30; p = 0.77).

    The data provide evidence across two experiments that social knowledge of creativity shapes differences in attractiveness judgements of men compared with women, when integrated with knowledge of their facial attractiveness. The initial experiment demonstrated a positive effect of creativity on men's attractiveness, independent of their facial attractiveness, when creativity was measured in the form of written interpretations of a painting. Of note, these two independent effects of creativity and facial attractiveness on overall attractiveness were both large in size. In the follow-up experiment, using a well-established measure of creative/divergent thinking, although the positive effects of creativity were not independent from the positive effects of facial attractiveness when participants judged men, the interaction between creativity and facial attractiveness demonstrated that creativity had a more substantial effect on the attractiveness of men with less attractive faces than it did for men with attractive faces. Indeed, creative men with less attractive faces were equivalent in attractiveness to attractive, but less creative, men. In a final experiment, where the initial experiment was repeated with a different set of face stimuli, an interaction between facial attractiveness and creativity was observed which, this time, was not qualified by a higher-order interaction with the sex of face judged. Collectively, these findings suggest, across three experiments, that creativity may provide ‘leverage’ for less physically attractive individuals as social and/or romantic partners.

    By contrast, for women, two of the three experiments demonstrated that facial attractiveness enhanced their overall attractiveness to a greater extent than creativity (written expression and creative thinking) enhanced their overall attractiveness. Indeed, across these experiments, creativity weakened the appeal of women with less attractive faces and did not benefit their attractiveness when displayed by women with attractive faces. This former unexpected finding (weakening attractiveness judgements) may suggest evidence for subtle denigration of creative women based on (low) physical attractiveness, although caution is urged in this interpretation in light of the findings of the final experiment, where creativity strengthened the attractiveness of both men and women with less attractive faces but did not enhance the attractiveness of men or women with more attractive faces. Collectively, the data provide novel evidence that creativity in the form of spontaneous written expression and creative ideas may have specific effects on men's attractiveness as a potential mate and/or social partner, although further independent replications (i) with novel face stimulus sets, (ii) across modalities (e.g. measuring indices of vocal rather than facial attractiveness) and (iii) other forms of creative expression could help to clarify the extent to which these findings generalize to responses to men compared to women.

    While complementing work on the importance of cognitive intelligence and/or creativity for mate choice in humans (e.g. [9,11]) and non-human species (e.g. [1,2]), the findings develop recent theoretical proposals on the importance of novelty and innovation in information transmission between groups more generally [6–8]. The data for male creativity and attractiveness are consistent with the initial proposal that attraction to creativity may generalize to non-mating contexts, as language transmits potentially useful information at no loss to the transmitter [22]. Moreover, indices of intelligence are thought to be important for group cooperation and cohesion [25], which, in turn, can facilitate access to resources and/or mates among males of other species (e.g. [26]). Indeed, as engagement with forms of art such as literary fiction improves performance on measures of social intelligence (e.g. theory of mind tasks; [35]), attraction to innovators may facilitate access to novel information in ways that increase the intellectual and social ‘leverage’ [24] of both male and female social groups.

    These findings address an underexplored area of the literature by testing for variation in appreciation of the creative output of men and women. While prior work demonstrates that creativity is attractive in both men and women and is enhanced in light of motives to attract a potential mate [12], those experiments did not consider how knowledge of creativity is integrated with knowledge of physical appearance. Here, findings in two of three experiments suggest that creative output may (unconsciously) have greater net benefits for men's than for women's success in mating competition, all else being equal. Moreover, they provide further evidence for the utility of testing for integration of social knowledge with surface cues in social judgements of others ([19]; see also [36]).

    Recent reviews have highlighted the need to test whether simultaneous assessment of observable cues to attractiveness and intelligence shapes mate choice (reviewed in [1]). Similar paradigms to the current experiment may prove fruitful in testing for integration of knowledge of cognitive intelligence with knowledge of ‘quality’, in the assessment both of potential mates and/or allies. In humans, testing for equivalent effects of creativity in other domains (e.g. musicality) in tightly controlled experiments may also prove fruitful.

    In conclusion, these findings present direct experimental evidence for a potential strategic advantage to creative displays in the form of written expression, which generalizes across contexts to creative thought more generally and ‘offsets’ putative cues to low biological ‘quality’ in faces. Evidence for women across two of three experiments suggests that facial attractiveness enhances their overall attractiveness while creativity does not, which could potentially shape sex differences in creative output in response to social evaluation. Across all experiments, the findings suggest that creative displays can strengthen the attractiveness of individuals as both social and romantic partners.

    All procedures were granted ethical approval from the School of Social and Health Sciences Ethics Committee of Abertay University. Participants provided informed consent before taking part in the research.

    All data files and associated codebook are contained as part of the electronic supplementary material.

    The author declares no competing interests.

    No funding has been received for this article.

    I am grateful to Dr Julia Teale for help with data collection (study one), to Mike Nicholls (funded by a Carnegie Trust grant awarded to the author) for assistance with collection of short-story extracts and to Dr Clare Cunningham for funding recruitment (study two). I am grateful to the anonymous reviewers for their helpful comments on the manuscript.

    Footnotes

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3734467.

    Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
