Jay L. Lemke
City University of New York
This DRAFT was the basis for a more carefully edited article for the journal Visual Communication. The draft contains material not included for publication and a number of the references have not been completed. ONLY the final published version should be cited as authoritative.
“Travels in Hypermodality” playfully echoes Umberto Eco’s title Travels in Hyperreality, which critically examined our fascination with imitation realities that are somehow more appealing than the everyday real world. As new technologies have enabled artifice to build new relationships to the givenness of the everyday world, we have moved from oral narratives to written literature, from drawing and painting to photography, film and video, and from interaction with fixed texts to new modes of participation in the material systems that make texts dynamically responsive to our readings of them. Eco wondered why anyone would prefer a Disneyland village to a real one, as we might wonder why a child prefers a doll to a playmate or a cartoon to a home movie. Some scholars today still wonder why anyone would prefer an illustrated text to a purely verbal one, or an interactive hypertext to a printed page.
Scholars are professionally obtuse. Designers on the other hand know very well that simplicity gives us welcome respite from the demanding complexity of everyday life and the less we’re constrained by realism, the more engaged our imaginations become. In many moods, people seek to actively construct meaning with whatever signs or sensory means are at hand. Both verbal text and visual images can be built to be more constraining of the meanings a reader makes or more enabling of the reader as a co-conspirator. Semiotic products can be designed to be more passive objects of contemplation or more active resources for the creation of further meaning. They can invite us to follow, or they can invite us to lead. Good design builds in both functions, in varying combinations, depending on the known purposes of designer and client and the imagined, or fantasized, purposes of a prospective user (cf. Kress & van Leeuwen 2001).
Hypermodality is one way to name the new interactions of word-, image-, and sound-based meanings in hypermedia, i.e. in semiotic artifacts in which signifiers on different scales of syntagmatic organization are linked in complex networks or webs. I will propose here that one useful way to understand the design resources afforded by hypermodality is to consider multiplicative combinations of the presentational, orientational, and organizational resources of each semiotic mode (language, depiction/imagery/graphics, and soundforms).
Hypermodality is more than multimodality in just the way that hypertext is more than plain text. It is not simply that we juxtapose image, text, and sound; we design multiple interconnections among them, both potential and explicit. In the simplest form of hypertext, we might have a web of “pages” (or paragraphs, sentences, or even single words) in which the whole or some part of the text of the page was linked to the whole or some part of another page (or even the same page) in some way other than by the default sequential convention of ordinary reading. The links might be invisible, discoverable by exploring some technology that actuates them. They might be partially explicit (e.g. a unit is marked visually as the source of a linking vector), but the target destination of the link, the nature of the meaning relation between source or anchor and target, and whether the link is reversible might not be explicit. Or all these features could be visible before activating the link. Links make hypertexts multi-sequential (cf. Aarseth 1997). There are many possible trajectories, or traversals, through the web of a hypertext. Meaning on a time- and text-scale long compared to the typical scale of linked anchor units (e.g. paragraphs or pages) becomes a creation of the user/reader that is far less predictable to the designer than in the case of a printed book whose narrative or argument has a single conventional sequence.
It’s important to be clear from the beginning in what sense hypertextuality differs from textuality. The difference is, first of all, one of medium. Typical meaning differences then arise because people exploit the affordances of one medium differently from those of another. A printed text is not itself truly linear or sequential as a medium in the sense that, say, spoken monologue is. Imagine a technology in which written words were presented to us visually one at a time. That would be a linear or singly-sequenced written textual medium. We don’t do it that way. We present written texts almost always, whether in handwriting, print, or on the computer screen, as a two-dimensional array, and we exploit that visual medium in many ways. We distinguish headers and sidebars from main text. We read clusters of words at least in a single horizontal line, and we are aware of paragraphing and sectioning.
And our eyes wander. There are many sources of visual salience on a page, and just as our eyes traverse a painting or diagram according to salient features and vectors linking them (Arnheim 1956), we may look away from nearby words on a page to more distant words that are salient because of typeface (italic, bold, small caps), or because of their recognizability (e.g. proper names or key words of interest to us), or because they happen to sit in a header or sidebar, or are the initial or final words of a paragraph or section. Our interests wander, too. We do not always start a printed text at the title page, or the first paragraph of the main text. We may leaf through a book, glancing at this page or that; we may turn to an index, we may follow the page numbers in a table of contents, we may look from a line to a footnote, to a bibliography of references, to an author index, and back to another page. This is a traversal in the print medium, using the technology of the book, both materially (turning pages) and by means of its genre elements (page numbers, index, etc.). But it differs from the hypertext medium not simply because the technology is different (one could use the technology of hypertext to simulate a book in all these respects) but because the web of connectivity of a hypertext activates our expectations that there will be links out from any present text unit and that there will be no single default reading sequence of a main text to return to, or against which we should be reading the content of an excursus. In hypertext there is only excursus: trajectories and loops on different scales, without a single unifying narrative or sequential development of a thesis, of a kind not readily available in the print medium.
Conversely, the print medium does afford genuine hypertextual traversals. One could indicate at the link anchors of a printed hypertext instructions for finding the target elsewhere in the printed book, and some experimental literary texts do just this, as do scholarly works with multiple internal cross-references. It’s just rather cumbersome, and slow. Hypertext technology makes for instant sequencing. The sequence generated by activating links (and not activating them) is the text-as-read. Unless there is also some default sequence indicated by the designer, there is no other complete text, there are only the text-units in no particular order. Hypertext makes us aware of the importance of text-scales. We do not necessarily make the same kinds of meanings with a text of hundreds of paragraphs that we make within a single paragraph, just as we do not make the same kinds of meanings with complex sentences that we make with single words or phrases.
Hypermodality is the conflation of multi-modality and hypertextuality. Not only do we have linkages among text-units of various scales, but we have linkages among text-units, visual elements, and sound units. And these go beyond the default conventions of traditional multimodal genres. Even on a single printed page of a magazine, newspaper, or scholarly article in the sciences, we know to connect certain graphical images with certain verbal units (via labels, captions, explanatory text) and vice versa (illustrations of narrative events, figures cited in the text). Organizational devices such as bounding boxes and nearness or juxtaposition combine with semantic content to indicate to us what goes with what across the modal divide between text and image. Hypermedia offer more kinds of connection than those provided for in print genres.
The hypertextual features of hypermedia also include new forms of intertextuality. Hypertext, as originally envisioned (Nelson 1965/1974, Bush 1945), provides not just for links within a text, but also between texts. In fact the notion of a “whole text” must be re-examined in the hypertext medium. There may be minimal presentational units (called “lexias” by Landow 1997) that are clearly wholes, but on text-scales much larger than this, the only semiotic objects are the traversals, the user-made occasion-specific sequences of lexias. In fact, in many hypertexts (e.g. those written with Storyspace from Eastgate Systems) the designer/author does create default and optional, but clearly marked, trajectories or pathways: sequences of lexias with explicitly indicated links. A user-traversal may follow such a pathway, or it may branch off from it onto another path, or chart its own course. Designer-provided paths in hypertexts, however, are rarely or never complete; they do not pass through every lexia in the web. They are partial connections, a compromise between author-guided sequencing and reader-selected sequences.
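The relation between a web of lexias, a designer-provided pathway, and a user-made traversal can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the lexia names, the links, and the helper `follows_links` are all hypothetical and do not describe Storyspace or any actual hypertext system.

```python
# Hypothetical web: each lexia id maps to the ids of the lexias it links to.
web = {
    "intro": ["theory", "example"],
    "theory": ["example", "notes"],
    "example": ["theory", "coda"],
    "notes": ["intro"],
    "coda": [],
}


def follows_links(web, sequence):
    """Return True if every step in the sequence uses an existing link."""
    return all(b in web[a] for a, b in zip(sequence, sequence[1:]))


# A designer-provided pathway (assumed) and one user-made traversal (assumed).
designer_path = ["intro", "theory", "example", "coda"]
user_traversal = ["intro", "example", "theory", "notes", "intro"]

# Lexias the designer's pathway never passes through.
unvisited = set(web) - set(designer_path)

print(follows_links(web, designer_path))   # True
print(follows_links(web, user_traversal))  # True
print(unvisited)                           # {'notes'}
```

Both sequences are legitimate routes through the same web, and the designer's pathway leaves one lexia unvisited, which is the sense in which such paths are partial rather than complete.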
The hypertext medium also permits connections not just from one unit to another within “the same web” (i.e. one recognized by various unifying features, see below) but between units within that web and either another whole web or units within another web. And this feature of the hypertext medium applies in hypermedia to links among text, image, and sound units across different multimedia webs.
The resulting universe of texts is not a seamless web. There are still organizational scales, both those inferable from internal homogeneities and heterogeneities (so that we know when we have crossed from one institutional website to another, even if the topic and images are very similar) and from external ones (e.g. economic and political boundaries, such as password barriers). The universal web is a fantasy, an updated version of the utopian world in which all national barriers could be passed freely (by the rich; the poor could scarcely walk into the nearest bourgeois neighborhood unaccosted by the police). There are barriers of language, which are not removed by translation (which changes meanings even as it gives partial access to them), barriers of proprietary property, privacy and secrets, barriers of nationalism and politics (foreign URLs and domains blocked by governments), and barriers of culture and values (from simple incomprehensibility to the blocking of access to “immoral” or “irreligious” sites or pages).
If we could follow the traversals of myriad users of the WorldWideWeb, we would see these barriers; most users will wander again and again in familiar territories, and very few will pass elsewhere and then only rarely. It may be useful, as I have proposed elsewhere (Lemke 2001, in press-c), to reserve the term traversal for a trajectory through hypertext (or through life) that crosses the divides between radically culturally separated domains, producing at least the possibility of some hybridity of meaning made across their disjunctions.
A Guided Sequence
I have tried in the previous section to give a rough idea of what I mean by traversals and hypermodality. I have not done more than suggest a number of technical distinctions and terminological niceties that a more systematic analysis would provide in detail. If you want to get a sense of these before proceeding, skip ahead to the Appendix and then return here. There you will find some distinctions among: the medium of hypertext, the technologies for implementing that medium, and the informational content of a particular hypertext web; the sequence of signifiers that constitute a trajectory through a hypertext web and the meanings made with those signifiers that constitute the traversal as such; the various scales or units of signifiers and meaning-making practices both extensionally (size) and in time.
The sequence of topics I have planned, if you read what follows in its default order, will first distinguish three kinds of meaning made by every semiotic act (presentational, orientational, and organizational), and then consider how meanings based on signifiers in different sign systems (language, depiction, music, etc.) can be combined or integrated to produce more specific and new kinds of meanings not otherwise available. I will illustrate these principles with analyses of some website pages and then return to issues of the politics of hypermedia design.
Multiplying Modalities: Presentational, Orientational, and Organizational Meaning
If we are concerned with the kinds of meaning that can be made with hypermedia, we need to examine two kinds of resources that extend beyond the affordances of plain text. One of these is the semantics of hypertextuality, which will be considered in the next section. The other is the semiotics of multimedia, particularly the integration of verbal and visual resources for meaning.
I take the position that, fundamentally, all semiosis is multimodal (cf. Kress & van Leeuwen 1996, Mitchell 1994): you cannot make meaning that is construable through only one analytically distinguishable semiotic resource system. Even if for many purposes we analytically distinguish the linguistic semiotic resource system from that of depiction or visual-graphic presentations, and both from others such as the music-sound system or the behavioral-action system, the fact that all signifiers are material phenomena means that their signifying potential cannot be exhausted by any one system of contrasting features for making and analyzing meaning.
If I speak aloud, yes, you may interpret the acoustical sounds I make through the linguistic system as presentations of lexical items, organized according to a linguistic grammar, etc. But you may also interpret them as indexical signs of my personal identity, individuality, my social category memberships, my state of health and my emotional condition. And I may manipulate vocal features of my speech, which are phonologically and lexically non-distinctive, so that I am heard to speak the same words, even with the same formal intonation patterns of the language, but in ways that present a foreign accent, identifiable dialect-associated features, a child-like timbre, a breathy seductive tone, nervousness, etc. The skills of accomplished actors demonstrate all this quite well. Interpretation of my speech-sound-stream through the terms of the linguistic system alone does not and cannot exhaust its possible meanings in the community.
Likewise, if I choose to write down my words, eliminating the affordances of vocal speech that give rise to this supra-linguistic meaning potential, I must still create material signs, which now again afford other ways of meaning: in handwriting there are many indexical nuances of meaning, in print there are choices of typeface and font, page layout, headers and footers, headings and sidebars, etc. Each of these conveys additional kinds of meaning about the historical provenance of the text, its individual authorship, the state of the author (in the case of handwriting), the conventions of the printer, which parts of the text are to be seen as more salient, how the text is to be seen as organized logically, etc. -- all through non-linguistic features of the visible text.
Beyond this, we can hardly help interpreting word-pictures with pictorial imaginations, visualizing what we hear or read, whether as image, technical or abstract diagram, graph, table, etc. And conversely, when we see a visual-graphical image, whether a recognizable scene or an abstract representation of logical or mathematical relationships, we cannot help in most cases also interpreting it verbally. Language and visual representation have co-evolved culturally and historically to complement and supplement one another, to be co-ordinated and integrated (Lemke 1998b, 1998c). Only purists and puristic genres insist on their separation or monomodality. In normal human meaning-making practice, they are inseparably integrated on most occasions.
But how? We know that there are specific genre conventions (cf. Lemke 1998b), e.g. verbal captions to visual figures, verbal labels within visual figures, verbal explanatory text which cites or refers to visual figures, visual placement to indicate which words are to be linked how to which other words (paragraphing, outlining, tables, sidebars, headers, etc.), visual signs to connect words (e.g. arrows), etc. We also know that the meaning of an image changes depending on the verbal label or accompanying text, and the meaning of text changes depending on the accompanying visual figures. But what is the capacity of this phenomenon? What can it do, and how does it happen?
These are the basic questions of multimedia semiotics: What kinds of meanings can be made by combining verbal, visual, and other signs from other semiotic resource systems? How do the meanings of multimodal complexes differ from the default meanings of their monomodal components in isolation? How do we construe the meanings of components in multimodal complexes and of whole complexes as such?
My basic thesis is that the meaning potential, the meaning-resource capacity, of multi-modal constructs is the logical product, in a multiplicative sense, of the capacities of the constituent semiotic resource systems. When we combine text and images, each specific imagetext (cf. Mitchell 1994) is now one possible selection from the universe of all possible imagetexts, and that universe is the multiplicative product of the set of all possible linguistic texts and the set of all possible images. Accordingly, the specificity and precision which is possible with an imagetext is vastly greater than what is possible with text alone or with image alone.
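The multiplicative claim can be made concrete with a deliberately tiny example. The sketch below assumes two closed, hypothetical inventories (three texts, four images); real semiotic resource systems are open-ended, so this illustrates only the arithmetic of the product, not its actual scale.

```python
from itertools import product

# Hypothetical toy inventories, invented for illustration only.
texts = ["a cat", "a dog", "a storm"]
images = ["photo", "sketch", "diagram", "map"]

# Every (text, image) pairing is one candidate imagetext.
imagetexts = list(product(texts, images))

print(len(texts), len(images), len(imagetexts))  # 3 4 12
```

Choosing one imagetext out of twelve discriminates more finely than choosing one text out of three or one image out of four, which is the sense in which the combined system affords greater specificity.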
That said, there are a few important qualifications. First, the existence of cultural traditions means that not all the things that can be said or pictured are said or pictured, and in particular, the probabilities for all possible combinations of textual items (on any text scale) or of all possible visual features (on any image scale) are never equal. So the Shannon information in any text or image or imagetext is always a great deal less than the maximum information possible if all combinations occurred with equal probability. In fact in semiotic terms humans make meaning by selective contextualization (Lemke 1995); we do not deal with all possible combinations and meanings, but rather we work across scales of organization of signs and events-as-sign to construe words and images appropriately for situations and genres, reading them against the intertextual frequencies with which in our experience various signifiers/features and meanings are likely to co-occur in various contexts. Within these layers of contextualization, however, the sets of possible signs still multiply the semiotic possibilities.
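The point about unequal probabilities can be checked numerically with Shannon entropy, the standard measure of average information. The sketch below is illustrative only: the two four-option distributions are invented, and `entropy_bits` is a hypothetical helper name.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Four possible signs: equally likely versus culturally skewed (assumed values).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_max = entropy_bits(uniform)     # the maximum for four options: 2 bits
h_skewed = entropy_bits(skewed)   # strictly less than the maximum

print(h_max, h_skewed)
```

Any departure from equal probabilities lowers the entropy below its maximum, which is the formal counterpart of saying that cultural convention narrows what is actually likely to be said or pictured.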
It would only be in a culture in which language and image were entirely redundant, where there was one and only one picture that could be associated with each text, and one and only one text that could be associated with each picture, that this multiplicative model would not apply.
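This limiting case can also be illustrated with joint entropy. The sketch assumes a toy world of two texts and two images with invented probabilities: in the independent case every pairing is equally likely, while in the fully redundant case each text has exactly one picture and vice versa.

```python
import math


def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical joint distributions over (text, image) pairs.
independent = {
    ("t1", "i1"): 0.25, ("t1", "i2"): 0.25,
    ("t2", "i1"): 0.25, ("t2", "i2"): 0.25,
}
redundant = {("t1", "i1"): 0.5, ("t2", "i2"): 0.5}

h_independent = entropy_bits(independent.values())  # 2.0 bits
h_redundant = entropy_bits(redundant.values())      # 1.0 bit

print(h_independent, h_redundant)
```

In the redundant case the pair carries no more information than the text alone (one bit), so adding the image multiplies nothing; in the independent case the contributions of the two modalities add in bits, which corresponds to multiplying the number of possibilities.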
Moreover, new meanings are made all the time; the universe of potential meanings is semiotically always larger than the set of meanings actually made so far. At the edge of new meanings, we require even more guidance from the cross-specifications of multi-modal representations because of the greater uncertainty in what a new meaning implies. A new word, a new kind of image, a new verbal idea take on meaning as they are used across contexts, and these include discursive contexts, multimodal representational contexts, and actional contexts at least. The new meaning comes to be as the community establishes conventions regarding how it functions and how it is represented in various contexts and various modalities of semiosis.
Consider also the issue of cross-modal translations. Even though a culture may create conventions about how, say, a painting is to be described in words, or commented on in scholarly fashion, or how a mathematical equation is to be graphically represented, text, image, and other semiotic forms are sui generis. No text is an image. No text has the exact same set of meaning-affordances as any image. No image or visual representation means in all and only the same ways that some text can mean. It is this essential incommensurability that enables genuine new meanings to be made from the combinations of modalities.
For meaning to actually multiply usefully across semiotic modalities, there must be some common denominators. At what level of abstraction can we say that images and texts and other kinds of semiotic productions make meaning in the same way?
All semiosis, I believe, on every occasion, and in the interpretation of every sign, makes meaning in three simultaneous ways. These are the generalizations across modalities of what Halliday (1978) first demonstrated for linguistic signs, when considered functionally as resources for making meanings. Every text and image makes meaning presentationally, orientationally, and organizationally. These three generalized semiotic functions are the common denominators through which multimodal semiosis makes potentially multiplicative hybrid meanings.
Presentational meanings are those which present some state of affairs. We construe a state of affairs principally from the ideational content of texts, what they say about processes, relations, events, participants, and circumstances. For images, one could apply the same terms, recognizing what is shown or portrayed, whether figural or abstract (cf. Kress & van Leeuwen 1996). It is this aspect of meaning which allows us to interpret the child’s unfamiliar scrawl on paper through his use of the word “cat”, or his indecipherable speech through his pantomime of eating.
Orientational meanings are more deeply presupposed; they are those which indicate to us what is happening in the communicative relationship and what stance its participants may have to each other and to the presentational content. These are the meanings by which we orient to each other in action and feeling, and to our community in terms of point of view, attitudes, and values. In text, we orient to the communication situation primarily in terms of speech acts and exchanges: are we being offered something, or is something being demanded of us? Are we being treated intimately or distantly, respectfully or disdainfully? We assess point of view in terms of how states of affairs are evaluated and which rhetorics and discourses are being deployed. The actual signs range from the mood of a clause (interrogative, imperative) to its modality (uncertainty, insistence), from markers of formality to the lexis of peer-status, from sentence adverbials (unfortunately, surprisingly) to explicit evaluations (it’s terrible that). Visually, there is also a presumptive communicative or rhetorical relationship in which the image mediates between creators and viewers and projects a stance or point of view both toward the viewer and toward the content presented in the image.
Organizational meanings are largely instrumental and backgrounded; they enable the other two kinds of meaning to achieve greater degrees of complexity and precision. Most fundamentally, organizational resources for meaning enable us to make and tell which other signs go together into larger units. These may be structural units, which are contiguous in text or image-space, and usually contain elements which are differentiated in function (subject/predicate in the clause; foreground/background in image composition). Or they may be cohesive or catenative chains, which may be distributed rather than contiguous, and in which similarity and contrast-within-similarity of features tie together longer stretches of text or greater extent of image as a unity or whole (repetition of words and synonyms; unity of palette).
In multimodal semiosis, we make cross-modal presentational (orientational, organizational) meaning by integrating the contributions to the net or total presentational (orientational, organizational) meaning from the presentational (orientational, organizational) meanings of each contributing modality (Lemke 1998b). Indeed, in many multimodal genres (and in all multimedia productions to some extent), the presentational (orientational, organizational) aspect of the meaning of a multimodal unit (at any scale, see Appendix) is underdetermined if we consider only the contribution from one modality. It may be ambiguous or unidentifiable or simply too vague and imprecise to be useful in the context of the next larger whole or embedding activity.
As a simple formula this componential multiplicative principle might be represented as:
Pr[L,V] = Pr[L] (x) Pr[V]; Or[L,V] = Or[L] (x) Or[V]; Org[L,V] = Org[L] (x) Org[V]
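Read as set products, the formula says that the combination is taken function by function: presentational options pair with presentational options, orientational with orientational, organizational with organizational. The following sketch is one possible toy reading, not a formalization the formula itself commits to. The class `Resources`, the function `combine`, and every listed option are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Resources:
    """Toy inventory of the options one modality offers for each function."""
    pr: frozenset      # presentational options
    orient: frozenset  # orientational options
    org: frozenset     # organizational options


def combine(lang: Resources, vis: Resources) -> Resources:
    """Componentwise product: Pr with Pr, Or with Or, Org with Org."""
    return Resources(
        pr=frozenset(product(lang.pr, vis.pr)),
        orient=frozenset(product(lang.orient, vis.orient)),
        org=frozenset(product(lang.org, vis.org)),
    )


# Invented example inventories for language (L) and visual depiction (V).
language = Resources(
    pr=frozenset({"names a process", "names a participant"}),
    orient=frozenset({"offer", "demand", "evaluation"}),
    org=frozenset({"clause structure", "cohesive chain"}),
)
visual = Resources(
    pr=frozenset({"depicts a scene", "diagrams a relation"}),
    orient=frozenset({"close-up", "distant view"}),
    org=frozenset({"foreground/background", "unity of palette"}),
)

combined = combine(language, visual)
print(len(combined.pr), len(combined.orient), len(combined.org))  # 4 6 4
```

The sketch treats the three functions as independent of one another, which is a simplification of how they actually work together in interpretation.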
In the complexity of real meaning-making, there are further complications to this basic principle. First, within a semiotic modality, presentational, orientational, and organizational meanings are not by any means totally independent of one another. The possible combinations do not all occur with equal probability, and functionally each one helps us to interpret the others, especially in short, ambiguous, or unfamiliar texts or images. Secondly, this same cross-functional phenomenon is very important in multimodal semiosis. Not only do Or[L] and Org[L] help to disambiguate and interpret Pr[L] (which in turn helps interpret each of them), and not only does Pr[V], by our first principle, help interpret Pr[L] (and vice versa), but Or[V] and Org[V] also play a role in making sense of Pr[L], and still more of Pr[L,V].
Human semiotic interpretation is both gestalt and iterative. That is, we recognize patterns by parallel processing of information of different kinds from different sources, where we are not aware of any sequential logic, and we refine our perceptions and interpretations as we notice and integrate new information into prior patterns in ways that depend in part on our having already constructed those prior, now provisional patterns. It is well known in the case of reading a text of some length that we form expectations about text-to-come and we revise our interpretation of text-already-read in relation both to new text we read and to the expectations we had already formed before reading it. In dialogic interaction we know well that what was said moments ago can at some future time come to have meant something quite different from what it seemed to mean at the time it was first said.
The viewing of images proceeds in a somewhat different fashion, but still undergoes similar processes through time. We may see a certain gestalt of a whole image, but if the image is complex enough in its details, if there are many scales of visual organization embedded within one another in its composition, then we will not have taken in all the details at first glance, nor will we have become aware of the many kinds of relationships, contiguous and at-a-distance in and across the total image. We examine relationships within different scales of organization, and we move our attention along different pathways through the image until we have exhausted these possibilities and made provisional interpretations, which then lead us to examine still more details at various scales, through the iterative process which may, as with text, converge on some overall interpretation, or diverge into many possibilities, or simply be unstable.
I believe that it is customary in our culture to pay conscious attention primarily to presentational meanings, to orientational ones only in special circumstances, and to organizational ones only if you are a professional user of the medium. We rely on familiarity with genre conventions to automate our use of organizational and orientational cues and allow us to proceed directly to presentational information, at least in institutionalized use of media, where we are taught that it is only the presentational content which is important for institutional purposes. Such approaches, of course, are highly uncritical. They ignore power relationships, presupposing institutional roles. They ignore the limitations of genre conventions on possible new meanings. They increase a certain narrow kind of efficiency, and minimize the ongoing threat to the social status quo. As professional analysts and designers, we concern ourselves very much with organizational meaning in an instrumental sense: as means to orientational and presentational ends. We also pay attention to orientational meaning, but again very selectively. We may design rhetorical strategies, but we may not question our own role or imagine alternative possible relationships to the users of what we design. We are likely to adopt a particular evaluative stance toward our presentational content (desirable, likely, surprising, obligatory, usual), but we may not consider where that stance positions us in the social universe of discourses about these matters or in our social relations to others and their interests.
In the particular case of webpages, which I will analyze after some further discussion of the semantics of hypertext, I recommend starting with Organizational aspects of meaning. What are the largest wholes and the largest components of these? What are the most salient and extensive chains of similarities and contrasts within sameness? One then works down toward finer features. Then it is useful to pay attention to Orientational meaning: What is the basic stance of the creator/work to the viewer/user? What demands does it make, what options does it offer? How does it seek to constrain and impose or empower? What stance does it take toward its own presentational content, regarding probability/realism, usuality, normativity, importance, etc.? And finally, what does it present by way of states of affairs, real and potential, and the processes, relations, participants and qualities, and circumstances thereof?
That is a first iteration. Then one needs to look specifically at how the Pr, Or, and Org elements of each of the participating semiotic modalities interact with and modify the meanings of the others, coming through an iterative interpretative process (cf. the hermeneutic cycle) to some provisional hybrid net meanings on various scales.
The fundamental issue that I want to explore in this paper is how multimodal semiotics interacts with hypertext semantics to produce the semiotic affordances of hypermedia.
What is unique about hypertext is that it is multiply connected by design; that is, (a) the reader’s sequential pathway through the meaningful units (on many scales) of the hypertext web cannot be predicted by the author/designer and (b) the author/designer normally plans for potential meaning construction along many possible reading sequences. (See Appendix for a more detailed analysis of the hypertext medium.) This changes the relationships among author(s), work/web, and user(s) in a variety of ways. From oral speech, across the various genres of written text, to hypertext webs, there is a wide range of balance struck between imposition of sequence by the author and improvisation of sequence by the reader. An author may create a work whose value to most readers depends on their reading it in a single author-determined sequence. Or authors may permit or encourage readers to create their own sequences through the meaningful units of the work. Every text, no matter how it may seem only to offer information to a reader, always also imposes upon and makes demands of the reader in respect of meaning. It seeks to manipulate, constrain, and control the meanings made by the reader at each text-scale, including the actions taken by the reader in operating the technology of the medium.
In most uses of the hypertext medium, there are severe limits on the useful degree of control over reader-constructed sequences, so designers must anticipate many possible moves by the reader. The combinatorial explosion of these moves rapidly passes beyond the capability of most designers to anticipate more than the most likely sequences, even when the time available for design is very long compared to the time over which readers construct traversals. So hypertext, while it can work just like other, more closed media, has its unique potential as an open medium. Matching the degree of openness to the anticipated uses of the web is a principal design challenge. Exploring the possible uses of the most open webs is a challenge for us all.
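A minimal sketch can make the scale of this combinatorial explosion concrete. It treats a hypertext web as a directed graph of lexias and counts the distinct reading sequences a user could construct by following a given number of links. The five-page web, its page names, and the function `count_traversals` are all hypothetical, invented purely for illustration; no actual website is being described.

```python
from typing import Dict, List

# Hypothetical five-lexia web: each page maps to the pages it links to.
# Page names and links are invented for illustration only.
WEB: Dict[str, List[str]] = {
    "home": ["about", "gallery", "essay"],
    "about": ["home", "essay"],
    "gallery": ["home", "essay", "notes"],
    "essay": ["home", "gallery", "notes"],
    "notes": ["home", "essay"],
}


def count_traversals(web: Dict[str, List[str]], start: str, steps: int) -> int:
    """Count the distinct link-following sequences of `steps` links from `start`.

    Revisiting a page is allowed, as it is for a real reader.
    """
    # counts[page] = number of distinct sequences that currently end on `page`
    counts: Dict[str, int] = {start: 1}
    for _ in range(steps):
        next_counts: Dict[str, int] = {}
        for page, n in counts.items():
            for target in web[page]:
                next_counts[target] = next_counts.get(target, 0) + n
        counts = next_counts
    return sum(counts.values())


if __name__ == "__main__":
    for steps in (1, 2, 5, 10, 30):
        print(steps, count_traversals(WEB, "home", steps))
```

Because every page in this toy web has at least two outgoing links, the number of possible sequences at least doubles with each additional link followed. Even for a web this small, a designer could not plausibly inspect every traversal of the lengths discussed here.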
By the semantics of hypertext, I mean the affordances of the hypertext medium for constructing meaning-relationships along traversals. This is the analogue of what might be called longer-scale text semantics in the case of more conventional verbal media. Just as we make meanings across many paragraphs or chapters that we do not make within a single paragraph or chapter, so we can make meanings in hypertext along long traversals (across say 10, 30, 100 or more units or lexias) that are not made in any one lexia or even across links between two lexias. We know relatively little about long-scale conventional text semantics (Lemke, in press-c). I believe that on intermediate scales (dozens of paragraphs, say) we know two basic things: (1) that meanings are made through the nested embedding of multivariate structures on different scales, particularly genre structures and extended rhetorical-argumentative structures, and (2) that meanings are also made through extended cohesion chains and cohesive harmonies that are orchestrated by their co-distribution patterns through a text and by their occasional intersection in a multivariate nexus at a shorter text scale (Lemke 1995b).
The first type of extended-text meaning is not easily made in hypertext webs that are rich in interconnection. It is possible to make use of genre-like structures, but difficult to hierarchically organize them sequentially. Cohesion chains, on the other hand, which are based on relations of similarity of units across extended text, work equally well in hypertext. I have argued elsewhere (Lemke 1995b) that thematic formations are the product of either intra- or inter-textual cumulations of meanings interpreted according to both these principles. It is also commonly said of long narrative fictions that their distinguishing meaning affordance is the building of convincingly detailed fictional universes, ones in which there is enough detail to provide explanation and motivation for long sequences of fictional events among fictional characters. One might say the same even of non-fiction histories or autobiographical narratives. In such cases the exact sequencing does not seem to matter, and this is the basis of much hypertext fiction (e.g. Michael Joyce's Afternoon), where global hypertext meaning is made progressively and cumulatively, but most fully only retrospectively. We encounter, as in a real-life exploration or mystery, one clue after another to help us build a consistent understanding (or several alternative theories) of events, and we cumulate and revise the overall pattern as we go.
It is less easy to create compelling logical arguments (cf. Kolb 1997) or coerce the reader to agreement on matters of analysis and interpretation, but it is still possible to raise for the user of a (branching, multi-sequential) hypertext web a great many interrelated questions that perhaps better reflect the complexity of the real. One may offer multiple perspectives, and indeed accommodate multiple authorship of the web, without loss of usefulness for such purposes.
I am not going to attempt here the task of developing a general, long-scale hypertext semantics, for either textual or multimodal hypermedia. I want only to sketch out some foundational notions and then see what happens in the webpage traversal examples I will provide. We need to begin, however, with at least an inventory of the shortest-scale cross-link meaning relations.
What kinds of meaning relations are typically construed across a single hypertext link? Such a link provides what is essentially an intertextual meaning relation, and we know (Lemke 1995b) that the kinds of meaning relations made between texts include those made within texts on longer scales. We also know that within texts there is a certain scale-invariance of meaning relations from the clause complex (Halliday 1994) to the rhetorical formation (Mann & Thompson 1986) and beyond. These basic relations are specializations of the semantic relations of Expansion and Projection between clauses (Halliday 1994, chap. 7). A clause is a complete semiotic unit: it provides a construal of what we ordinarily consider to be an extra-linguistic state of affairs (though in fact we make sense of experience and construct what we call knowledge of the world largely through the semantic affordances of natural language). Such verbal construals may have only certain semantic relations to one another. I believe that as a first approximation the kinds of relations construed between consecutive webpages or hypertext lexias are these:
- Links which tie one topic-specific set of semantic relationships to another in the same way such sets are internally connected (cf. thematic formations, Lemke 1983, 1995b); e.g. activity-to-actors, object-to-qualities, event-to-manner, activities linked by common actors and vice versa, etc.
- logical relations of expansion and projection: restatement, specification, exemplification, commentary; addition, exception, alternative; conditionality, causality, contextualization; quotation, opinion (Halliday 1994, chap. 7)
- rhetorical relations which further specify the logical relations, such as concession, opposition, disjunction, problem-solution, cause-consequence, proposal-evidence, events-generalization, etc. (Mann & Thompson 1986)
- Offer & response (accept, consider, demur, decline, reject, counter-offer); Demand & response (comply, refuse, etc.); more generally Offer/Demand -information, -action & response; Degrees of Offer/Demand: entice, suggest, propose, insist, etc. (cf. Halliday 1994, chap. 4)
- State-of-affairs/Evaluation (warrant, desire, importance, normativity, usuality, comprehend, humor); evaluative propagation chain elements; heteroglossic alliance/opposition (Lemke 1998a)
- Functional relations among the elements of such structures as: nominal group, clause, complex, rhetorical formation, genre (Halliday 1994, Martin 1992)
- Covariate chain element: similarity chain, co-hyponymic chain, co-meronymic chain; based on presentational or orientational features as above (Halliday & Hasan 1976, Lemke 1995)
For each of the general classes of semantic connections listed above, there are corresponding visual principles and forms. In fact, many of the verbal relations can be conceptualized as visual metaphors (e.g. chains, multi-slot structures, narrative scenes, viewpoints). Nonverbal visual works of comparable complexity and scale show all these features, as for example in rich, traversable visual environments (e.g. architectural spaces, artist-exhibition spaces, designed landscapes, online gaming worlds), dynamic visual displays (e.g. silent films, animations, auto-scrolling displays, theatrical and dance performances), and extended static visual series or sequences (e.g. cartoon books, graphic novels, ukiyo-e print collections, long scroll paintings).
What happens when we combine textual and pictorial-graphical resources across longer scales? Examples include dialogue-scripted or narrated film; gaming worlds with dialogue and embedded texts; hypermedia genres (e.g. CD-ROM encyclopedias); and, most commonly today, webpages and websites.
I would like to illustrate some of the principles I have put forward here regarding multimodality and hypertext semantics, at least for the short-range scale of single pages, page-pairs across single links, and very short traversals of webpages. I draw on my experience of having analyzed these instances in the context of much longer traversals (e.g. Lemke, in press-a).