PLAY ALONG: VIDEO GAME MUSIC AS METAPHOR AND METONYMY
ZACHARY NATHAN WHALEN
A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF ARTS
UNIVERSITY OF FLORIDA
To Stacy: partner, friend, lover, wife.
I thank Jane Douglas and Donald Ault for their tireless investment in this project;
the Graduate Game Studies group for moral support; and, most importantly, my wife for
her patience and cookies.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF OBJECTS
ABSTRACT
1 STUDYING VIDEO GAMES
Introduction
Context
2 FRAMEWORK
Immersion, Engagement, and Flow
A Linguistics Model
Game Genre and Music
3 FORMS
Ancestral Forms
Perspectives on Animation and Causality
Examples
Super Mario Brothers
Legend of Zelda: Ocarina of Time
Silent Hill
4 CONCLUSION
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF FIGURES
1 Relationship of the metaphoric (paradigmatic) axis of language to the metonymic
(syntagmatic) axis demonstrating a reading of the first two levels of Super Mario
Brothers
2 An example of mickey mousing in Disney's The Skeleton Dance
3 Mickey rides into town on an ostrich in Gallopin' Gauchos
4 Mickey flips a cigarette into the air and catches it with his disembodied teeth to
impress Minnie
5 Regular (small) Mario jumping produces a musical phrase which is repeated
continuously as one plays the game
6 When "Super Mario" jumps, his mickey mousing effect becomes exactly one
octave lower
7 Mario "dying"
8 In the Underworld the music changes to match the shift of location that has
occurred in the story-line
9 Excerpt from arrangement of Underworld theme for piano
10 The space of the castle levels is even further compressed than the Underworld
levels
11 Approximate score for castle level
12 Playing the ocarina in The Legend of Zelda: Ocarina of Time
13 Scale of base Ocarina note positions
14 "Saria's Song" from LZ:OT
15 "Normal" school building. Silent Hill 1999 Konami, Sony Computer
Entertainment Japan
16 Same space in the school, radically altered
17 Silence vs. "danger music" in RE:CV
18 "That's strange. It's getting darker"
19 Further down the alley. Silent Hill
20 Further still. Organ sound seems to trigger when Harry steps over puddle of blood
21 End of the alley. "What's going on here?"
LIST OF OBJECTS
1 Sound clip from The Skeleton Dance corresponding to Figure 2
2 Sound clip of cigarette toss, corresponding to Figure 4
3 Sound clip of "small Mario" jump effect
4 Sound clip of Super Mario jump effect
5 Sound clip of "failure" cadence
6 Musical excerpt from "Overworld Theme"
7 Sound clip from Underworld
8 Sound clip of music in a castle level
9 "Lost Woods Theme"
10 LZ:OT "Danger" theme: the blending of safety state/danger state musical
metaphors
11 Sound clip of 'normal' school with basic, ambient soundtrack
12 Sound clip of altered school with aggressively threatening soundtrack
13 Sound clip corresponding to Figure 17
14 Sound clip to accompany Figure 18
15 Sound clip to accompany Figure 19
16 Final sound clip from alley sequence
Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Arts
PLAY ALONG: VIDEO GAME MUSIC AS METAPHOR AND METONYMY
Zachary Nathan Whalen
Chair: J. Yellowlees Douglas
Major Department: English
This thesis argues for an approach to studying video game music such that video
game music can be seen as an integral part of games' overall semantic structure. The
terms metaphor and metonymy are borrowed from linguistics to describe two key
functions of video game music. First, the metaphoric function of video game music
provides a sense of space, characterization, and atmosphere in a game. It is also the way
music in games can be frightening or can evoke particular moods. Second, the
metonymic function of video game music is that which upholds the syntactic structure of
the game by compelling the player's involvement in progressing the game's narrative.
For example, game music supplies players with clues about approaching enemies,
thereby giving them an edge and an incentive to keep playing.
These two functions are also explored in the context of a discussion about video
game playing as a state of "immersion," "engagement," or "flow" (ideas derived from
schema theory and cognitive linguistics) which suggests that an ideal state of pleasurable
gaming is something like an act of creation or empowerment. Also, these cognitive
theories are supported with research and studies aimed at how viewers perceive and
interpret narrative information from basic shapes moving on a screen, and how different
musical accompaniments to those shapes relate to emotional interpolation or
anthropomorphism of the perceived actors.
Cartoon music is also a key context for video game music, especially in the way
that certain games use the same "mickey mousing" effect of synchronizing the
soundtrack with on-screen actions. Finally, three games are analyzed with the tools of
metaphor and metonymy to see how dominant game types relate to each other. One
conclusion drawn from this analysis is that Survival Horror games tend to exaggerate
conventions of older games to the point that the experience of playing games like Silent
Hill can be rather frightening. Using metaphor and metonymy, therefore, begins to shed
light on some of the implicit tensions between game genres that keep the field of games
interesting and exciting as an area of study.
STUDYING VIDEO GAMES
As recent attention in Associated Press stories1 and a New York Times article
attests,2 video game study is beginning to emerge from its murky status as "an academic
ghetto." Video games provide rich opportunity for interdisciplinary study, but at least
one aspect of video games remains to a large extent undiscovered. Music in video games
has thus far remained a tangential footnote to studies that attempt to totalize the medium.
While game studies is becoming increasingly assimilated into current strains of academic
discourse, "grand unified theories" of games fail to account for the processes by which
the musical soundtrack of a game affects the user's experience.
In this thesis, I attempt to develop a workable theory of video game music that
avoids certain formalist structures of game analysis, and instead approaches the question
of music as a part of the narrative component of games. While I intend to steer clear of
the ludology vs. narratology debate, certain assumptions and allowances must be made in
my approach that will ultimately state a position; but as the necessarily limited scope of
this inquiry requires a certain focus, I hope to move quickly beyond the metacritical
questions paralyzing certain conversations in the field.3 Accordingly, my conclusion is
based on cognitive theories of perception and questions of immersion vs. engagement as
a means of understanding "flow" or pleasurability in games, but my specific theories rely
broadly on a paradigmatic/syntagmatic model adopted from linguistics.4
1 Wadhams, Nick. "Of ludology and narratology." Associated Press. 14 February 2004.
2 Erard, Michael. "The Ivy-Covered Console." New York Times. 26 February 2004: G1.
3 Some examples of this paralysis can be seen in the volume First Person: New Media as Story,
Performance, and Game, where much ink is spilled defending certain approaches to studying video games.
The fact that these metacritical questions still relate to political biases within the academy currently
impedes any progress the field might be making toward establishing an autonomous, defensible position in
4 While it might be possible to use these linguistic patterns as a means to conclude that video games work
as linguistic expressions, I hope to avoid such grand conclusions and instead rely on the
paradigmatic/syntagmatic as an analogy for explaining what I see as two simultaneous functions of music
in a game.
To appropriately set the context, I must first give an account of current research
into video games and video game music and explain how my own work fits into that
field. I also show that the video game genre adopts certain roles for music from prior
media. Specifically, early cartoon music and horror films established certain tropes
which video games rely on today. Furthermore, studies of the relationship between audial
and visual elements in older media prove useful for understanding game music, because
certain basic ideas (e.g., diegetic vs. nondiegetic musical sound) apply to video games;
and studies that analyze how viewers interpret purely visual media versus combined
visual and aural media are invaluable to my study. The interactive element of video
games requires its own analysis, so a combination of theories of "flow" with these earlier
studies of visual/aural media leads to a set of terms that I hope will clarify how music
works in the relationship of the video game to the user. I propose two terms to describe
the two primary functions of game music, "metaphoric" and "metonymic," to discuss
how a game draws in players, both through the narrative, story elements (plot, setting,
characters) and through the challenge of solving problems or performing tasks with skill.
These terms also relate to a paradigmatic/syntagmatic analysis. Whereas every game
relies on a certain amount of both paradigmatic and syntagmatic expression, instances of
music correspond more or less to either one of those axes. When music corresponds to a
paradigmatic instance of a game's expression (e.g., providing a mood to fill a specific
environment), it can be thought of as metaphoric. When music corresponds to the
syntagmatic structure of a game's progress (e.g., "reward" music played after
successfully completing a level), that music can be thought of as operating as metonymy, or
as a heuristic (that is, teaching a player the rules of successful play through positive
The metaphoric behavior of game music is that which relates to the game as a story
or world. It is the function that draws the player into the experience, giving shape and
semantic meaning to that experience. When the constant background music in the classic
Super Mario Brothers switches from its sunny major theme to a tense minor theme, the
environment of the player-character has switched from broad daylight to a subterranean
cavern. This switch can be seen as paradigmatic in that the game's syntagmatic
structures of play are still in place: Mario must still move from left to right and progress
toward the final castle. The metonymic function of game music facilitates the player's
accomplishing the goals of the game. To remain with the Mario Brothers example,
whatever music is currently representing the environment increases in tempo as the end
of the level approaches. This teaches the player to move faster toward the level's
completion and thus enforces the syntagmatic properties of the game by pushing it
forward in a contiguous progression.
Obviously, these two functions are interrelated, but some games may have a
preponderance of one or the other. Tetris, for example, has very little in the way of
metaphoric music. Though there is an extent to which all games have a syntagmatic
rhythm of alternation between "safe" and "danger," it is necessary to limit this initial
argument to certain game types. I hope to show that these concepts do provide a way of
distinguishing game types and that these concepts actively correspond to recognizable
generic distinctions in games, but for the purposes of this argument, I focus on the
Platformer and Survival Horror genres.
It is important to note that this argument does not apply to games' cut-scenes (in
effect, short movies between levels that advance the plot or provide back story) because
their widespread adoption of filmic perspectives and techniques renders their analyses
more appropriate for film theory.5
5 Some very interesting work remains to be done in the area of cut-scenes, particularly in games such as
Half-Life or XIII where the player remains in control of his or her perspective while a cut scene plays. This
freedom leads to some interesting "performances" by non-player-characters in XIII as the game's AI
attempts to place the action in the player's field of vision no matter how much he resists.
Also, the differences between video game music and video game sound can be
subtle, especially if the music has an "industrial" style as in American McGee's Alice or
Silent Hill, and I often conflate the two for purposes of brevity and relevance. There are
some important ways in which video game sound deserves an entire analysis of its own,
but the broad strokes of my current argument apply to sound as well. Specifically, the
music/sound problem is complicated by a distinction between diegetic and non-diegetic
music in that the diegetic music functions similarly to the incidental sounds that populate
an environment. Sue Morris writes that sound in First-Person Shooter (FPS) games is
used "to provide an audio complement to action on the screen ... and to create a sense of
a real physical space" (Morris 88). A successful player, Morris argues, must perceive the
game's space in 360 degrees, most of which are provided as audial information, and
music playing on a radio in the game world fits well into this purpose of implying space
through sound. My argument deals more with nondiegetic music, but many of the sounds
a player hears are not generated "from" any visually represented object, and metonymic
aspects of game play often are not explicitly or necessarily musical (i.e., able to be
subjected to a melodic or harmonic analysis). In fact, the combined term "musical
sound" may be the most appropriately inclusive label. Therefore my argument applies to
many instances of game sound as well as game music.
Video games are increasingly being studied as important artifacts of culture and
communication, and the theoretical work of studying how and why games affect us is
steadily growing to include an array of approaches. The most controversial approach to
games from a humanities point of view is the adaptation of analysis techniques from prior
media such as film or television. Though the similarities between video games and films
are numerous, especially in games with more narrative-driven content (like many of the
Final Fantasy titles), some scholars see games as the "alternative semiotical structure of
simulation" and resist referring to games as objects or texts (Frasca 222). In other words,
the game an sich is the experience of play itself, not the programmed digital environment
and storyline built into the game by its programmers. Similarly, Markku Eskelinen has
famously made the analogy that "if I throw a ball at you, I don't expect you to drop it and
wait until it starts telling stories" (Eskelinen 36). The user function in games, Eskelinen
argues, is configurative whereas user function in "literature" is interpretive. For
Eskelinen, games are primarily "configurative" experiences because the user's
interaction is one of assembly: making the semiotic connections to construct the actual
syntax of the game. Audiences of film, however, merely "interpret" a narrative with a
pre-configured syntax, that is, configured by the film's writer, director, and editor.6 This
distinction attempts to separate games from simplified theories of game narrative and is
appropriately aimed at a user's interaction but cannot be universally true with the wide
variety of game types recognizable as video games. At any rate, given the diversity in game
types and the obvious divergence of some genres of play from what we normally consider
narrative (Tetris is the favorite example of ludologists), a strict "one-to-one" adoption of
film techniques to study games is certainly not warranted. Still, the apparatus of video
games is related to that of film or television in that they all require at least a screen and
transmit visual information, and theories of filmic sound as aspects of cinematic
storytelling provide at the very least a starting point for examining how music and sound
have a role in video game play.
6 I believe there are a number of contradictions involved in this argument, but this writing does not permit a
sufficient rebuttal. At least, his argument is a good example of the "anti-narrativists" in the present debate
within game studies.
The political subtext of the ludology versus narratology debate is clearly rooted in a
question of disciplinary investment. A "colonization" metaphor is frequently invoked
with games as an undiscovered continent poised for conquest by existing academic
disciplines. Espen Aarseth warns against this disciplinary colonization in his seminal
editorial "Computer Game Studies Year One":
Making room for a new field usually means reducing the resources of the existing
ones, and the existing fields will also often respond by trying to contain the new
area as a subfield ... games are not a kind of cinema, or literature. (Aarseth n.pag.)
Aarseth suggests elsewhere that so-called "narrativist" approaches to video games
redefine narrative into something unrecognizable ("Genre Trouble" 49); however, the
alternative is far from clear. In other words, formalizing games and game types under a
new vocabulary or one derived from computer science does not necessarily put the study
"closer" to the games, and the best answer may be that the fundamental question of
"What's at stake in game studies" has yet to be answered by either ludologists or
narratologists. At least for the time being, the current "anything goes" environment
Eskelinen bemoans ("Towards Computer Game Studies" 36) allows a variety of voices to
bring significant scholarly attention to games. Time will tell what model is the best fit;
therefore, it is still more appropriate for existing academic structures and disciplines to
turn their attention to include and account for video games. Certain approaches from film
theory offer starting points for understanding games, but as there is considerable
resistance to film study incorporating video games (cf. Roger Bellin's comments in the
New York Times) a few preliminary allowances must be made.
The most common approach to "games-as-remediated-film" is to mistakenly
identify mechanisms or apparatuses of film with their apparent parallels in games. For
example, the collection Screenplay: cinema/videogame/interface contains several essays
which casually conflate the player perspective in games with "the gaze" in cinema. For
instance, Wee Liang Tong and Marcus Cheng Chye Tan write of Real Time Strategy
games which allow the player to manipulate her perspective on the playing field that
"playing the game thus becomes almost synonymous with directing a movie" (101). This
kind of slippage is fodder for the often strongly-worded division between ludologists and
so-called narratologists, but video and computer games do exist within a history of media
and they do communicate through the familiar pathways of sight, hearing, and touch.
Though the ultimate interaction between the game and its audience may be different from
that of film to its audience, reception-oriented analyses of both films and games will
share some basic assumptions and critical techniques and lead to conclusions that better
account for the visual/aural/tactile content of games. Whereas the apparatus, technology,
and context for films and games hold obvious differences, studies are likely to find fewer
differences in the experience of viewing films and playing games, Eskelinen's objection
notwithstanding. Therefore, while my analysis will borrow certain conclusions from
studies of film sound and adopt a linguistic model to account for two functions of video
game music, my approach will begin with the musical experience of certain games, rather
than a categorical definition of games from which to deduce game musical structures. I
hope to avoid this type of "top down" logic by building from studies that examine the
basic ways in which aural and visual information combine in our perception to create
The disparate generalizations and controversies of definition demonstrate that
games themselves are slippery objects, and the fact that critics have yet to reach a
consensus on a critical vocabulary arises from the difficulty of describing games in
assignable categories. After establishing a relationship to film music, the second problem
in an approach to video game music has to do with game type. Though there are
definitely some universal principles which apply to many types of games, certain game
types lend themselves more readily to musical analyses. Unfortunately, the concept of
genre in games lacks consensus, and certain assumptions must be made to proceed past
the problem of generic formulation in games. As a means of addressing this problem,
David Thomas proposes a "vocabulary" of video game criticism. He attempts to label
each possible element of games and game play worth discussing and arrives at broad or
awkwardly conflated headings like "graphics" and "character" within a loosely
associational hierarchy, and he needlessly separates, for example, "camera" from "point
of view" (as Nick Montfort has commented on GrandTextAuto.org). While it is clear
that all games have graphics, and many of them have characters, it is difficult to use these
terms except in the specific contexts from which they derived. In other words, this
vocabulary can neither apply to every game nor offer a grand unified theory of games.
Nor can this vocabulary be "rebuilt" into games or game-types like ingredients.
Questions of what makes one game different from another remain unanswered. Aarseth's
textonomy offers a far more detailed hierarchy of games and types, but its textual focus is
as inflexibly reductive as Thomas's is vague. Aarseth's approach incorporates all
"ergodic" literature and does so with a context of textual criticism, but when these
conclusions are applied to the question of ludology, Aarseth begs his own question by
defining games as essentially not narratives at the outset of questioning whether games
are narratives. Game types, then, are seen as immanent behaviors of built-in, pre-
programmed interactive objects. Instances of textuality are incidental to the experience
itself, and so the question of pleasurability is lost. Of the many approaches to game type,
Aarseth's technique is most like a biologist's, picking apart the object of study to
compare how it works against others of similar type. However, many generic labels of
video games like Role-Playing Game, Real-Time Strategy, and Action describe a player's
actions, not the game's internal architecture or structure. In this vein, Mark J.P. Wolf's
efforts have been more in line with popular generic labels, but the weakness of his cumbersome
list of 46 game genres7 is that, despite its variety of game types, it is as difficult to place
any specific game within one of Wolf's categories as it is to locate specific games by
category on popular gaming websites. Generic labels such as "demo" and "abstract,"
furthermore, do not seem that informative. Therefore, I propose that alternative
models which avoid totalizing discourse are more appropriate for game studies.
Analyzing specific games in light of a specific game phenomenon, music, and in terms of
a cognitive approach to reception allows for a richer understanding of what happens
when we play games and why they are so important to us. Clearly, the seduction of the
text in video games has everything to do with the enjoyment of the text, and this state of
enjoyment or "flow" has everything to do with the music that accompanies the visual and
7 Catalogued in chapter 6 of The Medium of the Video Game.
In short, video game music allows for an analysis which borrows from elements of
a narrative theory to argue for a re-evaluation of the video game as a technical apparatus
which actively positions the viewer/user to respond to and interact with a system. This
thesis will build on a rather small body of writing on video game music including
Matthew Belinkie's useful history of game music online at the Video Game Music
Archive (www.vgmusic.com), David Bessell's chapter in Screenplay, and Paul Weir's
dissertation on sound design and structural approaches to music in games. Robert Bowen
has also provided an insightful analysis of Atari 2600 games as musical products
themselves, mapping musical structure onto the sound effects and programming
capabilities of the console. Belinkie's paper is a rich history of the most influential
composers working in video games, and though Bessell's chapter provides an interesting
analysis of several games, his approach fails to take game type into consideration and
instead compares and contrasts three games of wildly different type and structure.
Questions of game type are necessarily elusive, but Bessell's comparison of the games
Cool Boarders 2, Alien Trilogy, and MediEvil 2 is muddied by the fact that these games
represent widely different genres. Weir's work is perhaps the most useful because it is in
many ways the practical counterpoint to my theoretical analysis in that Weir works in the
game industry and is arguing for a sound design practice that incorporates structural (i.e.,
related to games' programming architecture) and interactive elements that game music
Gamasutra, a webzine for game developers, hosts regular feature articles on
sound and music design for games, but the three aforementioned articles and a handful of
others represent nearly all of the dedicated academic work on video game music within a
humanities mode. Many influential scholars have mentioned video game music as a part
of broader discussions of games, but this thesis is one of a very few works focused
explicitly on developing a unique model for understanding how video game music works
as a narrative expression. The fact that this work is currently marginal at best to game
studies' more active discourse suggests that the music in games has been taken for
granted and that this area has great potential for further inquiries.
Immersion, Engagement, and Flow
Avoiding the reductive and ultimately useless ludology/narratology context
requires that I clarify a few terms that will lead to an alternate, more productive mode of
game scholarship. The terms "immersion" and "engagement" have been invoked
generally to refer to the process of reading, specifically reading for pleasure, but in
introducing a third term, "flow," J. Yellowlees Douglas and Andrew Hargadon present a
context for describing the quality of interacting with a hypertext or interactive narrative
such that an ideal condition of flow in which "self-consciousness disappears, perceptions
of time become distorted, and concentration becomes so intense that the
game...completely absorbs us" is achieved as a dialectic between unconscious states of
immersion and conscious moments of engagement (Douglas and Hargadon 204). Victor
Nell's description of "ludic reading"8 demonstrates the concept of immersion and its
correlative terms "absorption" or "escapism" and what it achieves in approaching the
flow state. "Like dreaming, reading performs the prodigious task of carrying us off to
other worlds" (Nell 2). Immersion is giving in to the seduction of the text's story, to be
blissfully unaware of one's surroundings and the passing of time as one escapes into the
pleasure of reading. By contrast the experience of being engaged with narrative (or any
other semantic object or expression) involves an abstracted level of awareness of the
object qua object. In schematic terms, immersion is the act of relying on learned
behavioral scripts at a level of automaticity (being "in the moment" without having to be
aware of what it takes to be in the moment), while engagement is the process of learning
the scripts and requires an objective awareness of the object supplying the new schema.
In other words, one engages everyday objects (vending machines, laptop computers) on a
semantic level that builds on behaviors learned from past experiences and rhetorical cues
from the schema itself.
8 It is important to note Nell's use of the term "ludic" here. He is writing about the playful enjoyment of
reading books, but as the term suggests, his conclusions about immersion are applicable to play states in
Engagement, however, is the opposite case, in which one is forced to adapt to a new
experience. Experiences with everyday objects often stop short of immersion because the
object fails to provide an avenue for escape or disrupts user expectations by failing to
perform its part in the script. In practice, immersion and engagement provide a
continuum of experience, and to the extent that texts rely on the same cognitive processes
as the "real world," successful achievement of a flow state can be likened to being
actively immersed in the moment of engagement. Douglas and Hargadon provide the
examples of artists, musicians, and athletes who because of their skill in manipulating a
schema exhibit symptoms of flow (e.g., distorted sense of time, sense of freedom or
abstraction9) because their interaction with schema relies on a proficient degree of
agency. In video games, successful play often involves both an understanding of prior
scripts and an ability to intuitively engage new scripts by acting within an abstracted
9 An animated short based on The Matrix provides a rough illustration of this concept. In "World Record,"
a world-class sprinter "wakes up" from the illusion of the matrix in a key race. His athletic performance is
so focused on complete control of his "matrix body" that he is able to bend the rules of reality, and his mind
is freed from the matrix to become temporarily aware of his real body imprisoned in a holding pod. The
narrative ends cryptically, but this provides an analogy for a flow state in relation to an interactive in that
flow involves freedom from awareness of the scripts of the interactive.
For example, games that mix genres frequently require adapting to multiple styles of
play. Grand Theft Auto: Vice City requires skill in driving and in firing weapons from
both third- and first-person points of view. The game's graphics engine and controller
layout clearly favor the driving portion of the game, and players often complain of
difficulty in manipulating the player-character through third-person gun battles
where the game's over-the-shoulder "camera" has difficulty negotiating interior walls.
This problem frequently threatens to break the frame of the player's immersion in the
game's world by forcing frustrated engagement with the control pad, but something about
the balance between the game's unintended challenges and the game's rewards yields a
fulfilling sense of expertise when I successfully play the game. This feeling of efficacy
contributes to the experience being characterized as a condition of flow in that the
unification of efficacy with a compelling narrative yields something like a creative flow
Music relates to conditions of flow in at least two primary ways. First, in its
metaphoric function, music works to create the specific environment or diegesis the
player is immersed in. There are some important distinctions to be made in this regard
concerning diegetic versus non-diegetic sound, but in terms of the paradigmatic axis of a
game experience, music is what draws the player in. Second, at the syntagmatic level,
music often serves as a metonym for progress in the game. In the examples I will
discuss, music is most often used as reinforcement for good or bad performance
in the game, thus encouraging the player to maintain the syntagmatic continuity of the
game experience by successfully progressing through the game's content. Music can also
literally be a heuristic device in a game like Legend of Zelda: Ocarina of Time where a
player must memorize and play specific musical phrases to access locations and special
abilities in the game. A game is only syntagmatically contiguous as long as a player is
advancing, but it is possible for a game to be diegetically immersive along its specific
paradigmatic presentation. In order for a user to achieve the desired flow state of game
play, she must embrace the paradigmatic gestures of the metaphoric function of game music
and respond to and interact with its metonymic functions.
Music has an impact on flow in at least one other regard. Many writers (Juul,
Eskelinen, Douglas, Morris) have discussed the perceived distortion of time
experienced by committed game players, and it is important to note that several studies of
filmic sound also conclude that music helps accomplish a similar suspension of temporal
disbelief. Annabel Cohen gives an example of a cinematic event in which a young
baseball player leaps to catch a fly ball. The action suddenly becomes slow motion, and
the audience collectively holds its breath along with the now silent soundtrack.
Triumphant music resumes when the hero returns to earth with the ball safely in his
glove, but the film's silence during the protracted moment of suspense does not strike the
audience as an odd moment of temporal instability because the music works to regulate
the flow of the movie's temporality (Cohen "Perspectives" 361). The frame of immersion
is not broken for the audience because the return of sound "narrates" the tempo of the
diegesis. The same analysis also applies to game music, and this temporal phenomenon
works equally in metaphoric and metonymic functions.
A similar question that is often confused with immersion has to do with apparatus
theory. These theories examine the effects of the physical technology of viewing cinema
or television. Specifically, these apparatuses yield dominant viewing modes and, whereas
cinema audiences employ "the gaze," viewers' relationship to television can be described
as "the glance" (Flitterman-Lewis 217). Morris extends this analysis to video games and
concludes that "if film has 'the gaze' and television has the viewer's 'glance,' then [First-
Person Shooter] games have the penetrating 'stare'" (Morris 90). The key idea of
"penetrating" the screen allows for certain rich psychoanalytic approaches to video
games,10 but it is a mistake to confuse this imagined literal immersion in the 3D
environment of the game with narrative immersion in the game play experience. This
perspective-based immersion is clearly something different, and though this may be part
of a phenomenology of video games, it is beyond the scope of my argument. Morris is,
however, making a comparison that characterizes the experience of serious involvement
in a video game (specifically a first-person shooter) with a similar analogy to immersion,
and while it may appear that she is making the common mistake of conflating
"immersion" with a first-person point of view, Morris is in fact describing the degree of
narrativised engagement involved in successful online game play. In this version, sound
is crucial to experiencing the space in three dimensions and in the player's placing
himself within the 3D world of the game. The goal of successfully experiencing the flow
of play is cognitive or figurative immersion, and the resulting "loss of time" is similar to
the immersive experience of ludic reading.
A Linguistics Model
The syntagmatic and paradigmatic axes of language relate to the semantic
structure of language as a dialogue between two complementary forces. Figure 1
shows a chart adapted from Joel Dor's introduction to Lacan that demonstrates the
interplay of these two forces. I am adopting the metaphoric and metonymic functions as
descriptions of musical operation because these terms highlight the act of translating their
respective axes to the reader/user. This assumes that video games are semantic
constructions, and (while this inquiry implicitly argues a position on this controversy)
the functions of video game music can only be seen as metaphoric and metonymic in the
context of that linguistic model. This framework is, in a sense, in the same vein as Espen
Aarseth's formalizing methodology, but it is important to note that my use of these
structures is at this point merely to provide that framework as a context for approaching
the overall question of the experience of video game play.
10 Cf. Laurie Taylor "When Seams Fall Apart: Video game space and the player." Game Studies. 3.2
(2003).
[Figure 1: chart whose horizontal axis is labeled Metonymy / Melody / Syntax / Sequence, with rows for World 1-1 and World 1-2.]
Figure 1. Relationship of the metaphoric (paradigmatic) axis of language to the
metonymic (syntagmatic) axis demonstrating a reading of the first two levels
of Super Mario Brothers
Music functioning metaphorically, therefore, moves the user's experience of a
video game along the paradigmatic axis (downward in this diagram) to create an effect of
substitution. This function also operates on a basis of similarity and recognition
corresponding to received notions of musical sound in films.11 Roman Jakobson's
influential definition of metonymy is also useful for identifying functions of music which
operate along the syntagmatic axis. For Jakobson, discourse develops and proceeds
either through a recognition of similarity or of contiguity, and the metonymic pole is the
function of continuous association. Jakobson's example is a free-association test in which
the word "hut" served as the stimulus and subjects were asked to record the first word that came
into their minds. Answers like "burnt out" and "is a poor little house" are said to be
metonymic or contiguous because they exist in a predicative or narrative context (the first
provides a positional syntax which makes sense grammatically, as in "the hut was burnt
out," and the second provides, in addition, a semantic relationship between the idea of a
house and a hut), while answers like "cabin" or "hovel" are substitutive or metaphoric in
that they provide synonyms which "replace" the stimulus (Jakobson 42).
Seeing a video game as a mode of discourse is a bit problematic, but in the
unfolding of simulated on-screen events, there is clearly a substitutive or paradigmatic
dimension in which one object, event, or character can replace another, and there is
clearly a progression through events that relate to one another in a predicative
arrangement. By focusing on the metaphoric and metonymic functions of music, I am
attempting to locate the transferal of the pre-programmed structure of the video game to
the experience of the player as a question of cognition.
11 Specifically, the "leitmotif" formula of identifying a character or object with a musical "signature"
operates metaphorically as a substitution both for that character (in an audial/visual dialectic) and for other
characters (in a temporal dialectic).
I am not, however, attempting to equate the metaphoric function with immersion or
the metonymic function with engagement in any kind of structuralized behavior. Rather, the
"path" of immersive information by way of music generally follows a metaphoric
trajectory, and metonymic music expedites engagement in the process. In other
words, music as metaphor and metonym works both to make the syntax of the game
coherent and consequently to contribute to the state of flow.
Music itself affords a metaphoric and metonymic analysis which describes
loosely the way that musical structure is perceived as a coherent unit. Music theory can
identify complex musical elements like pitch, tempo, and timbre and can describe the
effect those elements have on the perception of music. For the purposes of this thesis, a
basic feature of musical sound demonstrates a useful association with the metaphoric and
metonymic axes of language. On a surface level, at least, musical sound has sonority or
an identifiable musical character based on one sound's relationship to another. This can
occur simultaneously in a harmonic relationship or sequentially in a melodic relationship.
For example, the key interval in a minor chord or scale is the minor third. In the
key of E, the tonic or base note is E, and a G note is at the position of the minor third.
Playing these notes together produces the core of a minor chord, and substituting a G# for
the G changes the chord to major and produces a different effect. This substitution
relates the chordal or harmonic nature of music to the metaphoric or paradigmatic axis of
language. Playing a melodic sequence of a G followed by an E produces a similar effect,
but in this case the relationship can be related to metonymy because the syntactical
context of the notes provides their basis for association. If this sequence is interrupted by
other notes, or a period of silence, the connection is perceived less clearly. There are
numerous other relationships in music, but the basic harmonic/melodic distinction provides a basis
for discussing metaphoric and metonymic uses of music in that the music itself contains
and conveys this linguistic structure and uses that perceived association to reinforce
metaphoric and metonymic expressions of video games.
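The interval arithmetic behind this example can be made concrete with a minimal sketch. It assumes the common twelve-tone numbering of pitch classes (C = 0) and treats all notes as lying in one octave; the function names are hypothetical and serve only to illustrate the harmonic substitution (G versus G# above E) and the melodic sequence (G followed by E) described above.

```python
# Illustrative sketch only: the names below are hypothetical, not taken from the thesis.
# Pitch classes follow the common 12-tone numbering with C = 0.
PITCH_CLASS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}

def semitones_above(root, note):
    """Semitones from root up to note, folded into one octave."""
    return (PITCH_CLASS[note] - PITCH_CLASS[root]) % 12

def third_quality(root, note):
    """Name the third formed above root, or report that it is not a third."""
    return {3: "minor", 4: "major"}.get(semitones_above(root, note), "not a third")

def melodic_steps(notes):
    """Signed semitone steps between successive notes, assuming one shared octave."""
    return [PITCH_CLASS[b] - PITCH_CLASS[a] for a, b in zip(notes, notes[1:])]

# Harmonic (paradigmatic) substitution: swapping G for G# changes the chord quality.
print(third_quality("E", "G"))    # minor
print(third_quality("E", "G#"))   # major

# Melodic (syntagmatic) sequence: G followed by E spans the same interval, heard in order.
print(melodic_steps(["G", "E"]))  # [-3]
```

The same three-semitone distance appears in both cases; only the mode of presentation (simultaneous or sequential) differs, which is the distinction the paragraph above maps onto metaphor and metonymy.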
Game Genre and Music
Music takes on several roles in video games and different forms of music fill the
aural landscapes that accompany the visuals of video game environments. Typological
questions in music have so far not led to the type of taxonomic rendering that games and
new media in general have been subject to in countless analyses, but one could imagine
such a development. The few essays that attempt to address video game music
aesthetically seem to lack an operative understanding of game type, so one could argue
that an exhaustive catalog of game genres and sub-genres should be paired with a similar
catalog of musical genres and modes to arrive at a formula for determining how music
functions to accomplish the mysterious immersive effect of successful video games.
Indeed, tentative essays such as David Bessell's chapter in Screenplay seem to suffer
from faulty parallelisms and tendencies toward essentialist conclusions that lack a genre-
informed critique. His essay is a straightforward comparison and contrast of three games,
Alien Trilogy, MediEvil 2, and Cool Boarders 2, but his examination noticeably fails to
account for the fact that it compares games of different genres which
employ different genres of music to accomplish their desired effect of immersion. The
lack of correspondence between the subjects allows only for uselessly general
conclusions about video game music (for instance, that "video games use music in
different ways"). One way around this problem would seem to be to focus narrowly on a
specific game type or music type with specific questions like "How do First-Person
Shooters use orchestral sounds to invoke specific emotions or narrative situations?" but
the taxonomic tendency in video game studies distracts from the potentially more useful
work of game studies and jeopardizes what is at stake in studying games in the academy.
In other words, playing into received notions of "type" invites distracting criticism and
moves the conversations a further step away from the real question of how video game
music "works" in conjunction with the visual and kinesthetic aspects to create an
Therefore, while this study moves toward a comprehensive account of the function
of music in games, some necessary exceptions must be made. It is more useful to the
task at hand to restrict the inquiry into music's role to a select few genres. It is hoped that
this will lead to subsequent analyses of related genres and eventually, following a
coherent genre mapping, bear out the metaphoric/metonymic model as a comprehensive
theory of video game music. Thus, while the implications of this study will be broadly
applicable to the field of game studies, the analysis itself and immediate conclusions must
be necessarily limited to a few genres.
The "Platform"12 and "Survival Horror"13 genres will be the two primary genres I
focus on because I hope to show that the platform game structure established archetypal
musical patterns and relationships to story elements which the survival horror
genre exaggerates to the point of psychosis. The platform genre was the first game format
to adopt exploration of space as the primary metonymic indicator of progression, and
survival horror games use that space again, not as a maze-like puzzle in itself but
as a metaphoric vehicle for the communication of engaged emotional response. Whereas
the platform game operates on a success-frustration continuum, survival horror operates
on the more complex safety-fear continuum as an exaggeration of the platform games'
archetypal conflict as communicated by the musical soundtrack.
12 Wikipedia, an open-source web database of knowledge, contains an entry that defines the Platform genre
as follows: "Tradtionally [sic], the platform game usually scrolls right to left, with the playable character
viewed from a side angle. The character climbs up and down ladders or jumps from platform to platform,
fighting enemies, and often has the ability to gain powers or weapons." Definitions of genres in games are
inherently problematic, but the Wikipedia definition allows me to temporarily avoid definitional debates.
13 A Wikipedia entry defines Survival Horror as "a genre of video game in which the player has to survive
an onslaught of undead or creepy opponents, usually in claustrophobic environments in a third-person
A typical platform game's music is more closely associated with the structure of
game play (a fundamental safety vs. danger dynamic that propels the player-character
through metonymic progress), but the music also presents a narrative of the game's diegetic
character. The survival-horror genre is easily identified by its zombie-filled story content
and mood-inducing music, but successful survival-horror franchises also use sound as a
configurative interface in two key relationships, player/combination (metonym), and
player/substitution (metaphor). For example, in the first Silent Hill game, the player
character carries a radio which emits certain patterns of music-like static when enemies
approach out of the ubiquitous fog. A player must learn to use the sounds to predict the
size, distance, and type of enemies approaching in order to survive. Thus the game's
structure or mode of play is reinforced by the music, and the game designer's decision to
place the music within the diegesis maintains the continuity of the first interface,
narrative/structure. Furthermore, the use of non-diegetic music in the game is at an
eponymous minimum, so the intrusion of the music does not break the frame of the user's
engaged presence in the game world.
Since comparing films with games at all is a controversial position, it is worth
pointing out some of the basic points of similarity from which we can derive a useful
model of analysis. The fundamental overlap between video games and films is the
fact that both rely on aural and visual cues to convey a
sense of a consistent diegesis. Still, it is appropriate to justify this connection further.
Therefore, I intend to discuss theories of film music and cite examples where the
conclusions seem to apply to games as well. Then, through comparative examples, I will
show how the added element of interactivity and the resulting discrete temporal
framework displaces several of the assumptions based on the comparison of game music
to film music. The specific examples will lead to more generic conclusions about how
music operates in specific structures and, finally, will draw on cognitive theories of
immersion and engagement to propose an understanding of "flow" as a mode of
experiencing a text.
Paul Ward proposes an interesting corollary point about games as a form of
animation in that both games and animation strive for a form of representation that is
more exactly termed "emulation" than "simulation" in that both the game and the
animated film rely on similar production techniques.14 Significantly, both the game's
interactive world and the diegesis presented by the animated film respond to the
characters in a manner that can only be believed if it is not realistic. Paradoxically, the
amazement we feel at the level of detail presented in the environment of the characters
may draw us in as a spectacle of technology,15 but the actual dimensions of the
represented world are not dependent on their referent, reality, but on the capabilities and
narrativised goals of the characters. In both cases, animated film and game animation,
timed musical cues and sound effects typically suggest a responsive, narrative-specific
environment aimed at either immersing the viewer/user in the spectacle of storytelling or
engaging the viewer/user in the kinesthetic emulation of problem solving in a narrative
14 Obviously, a game generates animation in real-time reaction to player input while a feature animated film
like Toy Story is rendered in advance, but the underlying technology is similar.
The first step in configuring the relationship of film and video games is to look for
similar types of metaphoric and metonymic functions of music in film. Film music in
general follows recognizable patterns of story development which rely on music to create
an emotional effect in the audience, or to provide "navigational" cues which alert the
viewer about plot developments. For example, a common narrative cue occurs when a
mortally wounded character dies. The actor closes his eyes, and the soundtrack supplies
a strong "hit" on a minor chord which indicates that the character has died. The fact that
other characters typically accept this moment as final and do not attempt to revive the
deceased character as one might do in real life attests to the narrative weight of the
death moment when it is reinforced with a cue from the soundtrack. Cartoon music also
relies on metaphoric and metonymic functions of music to impel its effect of emulating
reality, and there is a more fundamental transference of technique between cartoon music
and video game music, as I will discuss below. First, for the association with film music,
it is important to note a few influential theories of film sound and how they relate to the
metaphoric/metonymic functions of music so that similar practices can be observed in
15 Andrew Darley argues that the potential computer games offer for "immersion" (he is using the term in a
slightly different sense than Douglas and Hargadon) is a question of degrees. The technology of computer
games allows for a better or more convincing exploitation of the normal visual codes we have adopted from
earlier media to the extent that computer games can offer a more realistic illusion of being in the space of
the game (163).
According to Stam et al., most film-music analysts tend "to distinguish only
between redundant music-i.e., music which simply reinforces the emotional tone of the
sequence-and contrapuntal music-i.e., music which 'goes against' the emotional
dominant of the sequence" (59). This simplified binary is important because it only
relates to the emotional identification of the audience with the subject in the film, and
complications of this system such as Chion's "empathetic music," "a-empathetic music,"
and "didactic contrapuntal music" similarly relate only to the audience's relationship to
the film subject as opposed to the audience's recognition of the film's syntactic continuity.
Syntagmatic properties of sound receive little treatment in these musical formulas, but
music is clearly related to the perceived contiguity of several of the identifiable
syntagmas operating in films (Stam et al. 40). Other analyses identify the functions of
film music as aiding memory and suspending temporality (Cohen "Perspectives" 361),
and it is interesting to note that these functions are identified in the context of cognitive
approaches to film whereas Stam, Chion, and Claudia Gorbman are working within a
structural or semiotics framework that dissects the film as a textual object. My approach
of employing the metaphor and metonym, techniques of linguistics, seeks to combine
both, as the textual object of the video game requires constitutive interaction to the extent
that structural or formalist approaches like Aarseth's are insufficient if they ignore the
cognitive operations of piecing the text together (i.e., beyond the "nontrivial"
interaction demanded by the technology (Aarseth Cybertext 1)). The video game's
reliance on input from the user makes it a text whose syntagma must be, in part, "outside"
the game itself.
The difference in the temporal disposition of each medium (film and games)
clarifies, in a sense, the boundary erected by interaction and provides yet another way of
analyzing the immersion vs. engagement continuum. Most importantly, the syntagmatic
function of film music is necessarily distinct from metonymic functions of video game
music, but broader metaphoric functions still apply the same way as in film to the extent
offered by the relationships between the player and the player-character. The difference
is that the emotional relationship of player to character is not as important as the
relationship of the player to the game by way of the character. The cartoon provides a
stronger candidate as a prior media form for comparison with video games because cartoons
also must generate in the audience a relationship with the environment of the cartoon.
Cartoon audiences, like game players, must suspend a significant amount of disbelief about the
emulated reality of the cartoon or game world to perceive the character's actions within
that world as coherent.
Early cartoon music also provides a better comparison to video games because, as
Ward's writing suggests, there is a stronger tie between games and cartoons than has
been recognized. Cartoons rely on music to reinforce the impact of their visuals, so the
relationship of the viewer to the character operates under the redundant mode identified
in film theory (in the "death scene" example, the music restates the narrative action), but
in a different sense. Cartoon music can be emotionally expressive, of course, but its
primary function is as a kinesthetic vehicle such that live-action film is often deemed
"cartoon-like" when musical cues accompany or emphasize violent physical action. The
point may be to provide a humorous counterpoint to the visual of the violence and to
characterize the violence as not hurtful so that we laugh at it (Strauss 8), but Michel
Chion's more poetic description of cartoon sound suggests a more profound involvement.
Describing Tex Avery's What Price Fleadom, Chion writes
... and sound-ineffable and elusive sound-so clear and precise in our perception
of it, and at the same time so open-ended in all it can relate-infiltrates the
reassuring, closed and inconsequential universe of the cartoon like a drop of reality,
a tiny, anxiety producing drop of reality. (122)
Studies of cognition have corroborated this observation with results which suggest that
objects are perceived as alive and exhibiting anthropomorphic behavior when their
motions are accompanied by a synchronized soundtrack (Cohen "Perspectives" 361). The
possibility he describes relates to the paradigmatic or metaphoric function of sound,
whereas for Chion, one of the key functions of sound is to aid in an audience perception
of a spatial diegesis. At the very least, the musical cues and non-musical sound effects
instill objects with even more life than the simple verisimilitude of figures in motion.
This effect is termed (often pejoratively) "mickey mousing,"16 and seems to be at
odds with the "serious storytelling" potential of cartoon music expressed in, for example,
Fantasia. Mickey mousing occurs in both animated and live-action cinema when the
music provides a direct, aural imitation of what is happening on the screen (Neumeyer
and Buhler 6). The telling involved in simple mickey mousing certainly seems to be more
physically or kinesthetically oriented, but in that it represents a character's relation to its
fictive universe, mickey mousing corresponds to the metaphoric or paradigmatic axis of
my approach. Music which applies musical cues to physical, slapstick violence also
places the character within the aural transmission of the story diegesis and accomplishes
the emulation of reality that Ward mentions. This exact practice is also used in video
games with only a few differences, as the following examples demonstrate.
16 or "mickeymousing" in Chion.
The earliest uses of the mickey mousing effect occur in classic cartoon works such
as Skeleton Dance. Scored by the legendary Carl Stalling, Skeleton Dance demonstrates
the complicated blend of diegetic and non-diegetic music and sound that merges to create
an immersive story. The narrative is structured around a group of skeletons performing a
dance routine to an orchestrated song reminiscent of Saint-Saëns' Danse Macabre.17 But
the introduction of the story blends music with sound effects to create an eerie
atmosphere. The following image (Figure 2A) shows two cats responding to the sight of
a skeleton rising from his grave. As the skeleton rises, we hear an ascending D minor
scale (Figure 2B) on a stringed instrument, a common figure in cartoon music, and the
cats' fright is mimicked in a similar arpeggio (Object 1).
Figure 2. An example of mickey mousing in Disney's The Skeleton Dance. A) The
skeleton's rise from the grave is synchronized with an ascending scale. B)
Musical approximation of sound accompaniment for A. © 1929 Disney.
17 According to Stalling, some writers have even mistakenly said that the music for Skeleton Dance is
actually Danse Macabre (Stalling 39).
Object 1. Sound clip from Skeleton Dance corresponding to Figure 2. WAV file
(obj1.wav; 6 seconds; 131kb)
Similarly, as the skeleton begins to skulk about, his footsteps are punctuated with a
staccato harmonic minor scale in D, emphasizing hollow, wooden timbres in the
percussion (Figure 2). As the piece continues, a "foxtrot in a minor key" (Stalling 39)
with quotes from Edvard Grieg's March of the Trolls, the orchestration mimics a
hollow, dry sound one might expect from dancing skeletons by using a marimba or
something similar to carry the melody and accentuate the skeletons' percussive
motions.18 These harmonic choices correspond to metaphoric uses of music to convey a
specific mood in accompanying the visual, but the trajectory of the plot and the
metonymic combination of the visual and aural drive the piece forward and maintain
its contiguity as an "emulated" event. Diegetic and non-diegetic music19 blend with the
sound to create a specific and compelling mood. The music as an underscore of physical
action soon blends into a choreographed dance number where the skeletons clearly
respond to and produce the musical accompaniment we hear. Thus the "location" of the
music has become clearly diegetic, whereas the initial mickey mousing gestures are
non-diegetic underscore. The fact that this shift is accomplished seamlessly corresponds
to the metonymic contiguity of the video game in that the atmospheric or "tone setting"
sounds of the video game's worlds quickly and smoothly give way to didactic or heuristic
implementations that emphasize the video game's syntagmatic structure.
18 A marimba is a percussion instrument similar to a xylophone but with wooden keys that are normally
struck with soft mallets. The tone is richer and "warmer" than its metal counterparts, but when a
percussionist uses hard mallets to play the keys, the sound is hollow and dry.
19 Chion and Claudia Gorbman have both drawn up complicated formulas of filmic sound that identify
degrees of origination between the "black and white" analysis of diegetic and non-diegetic sound, but for
the purpose of the present argument, the smooth, unproblematic combination of the two accomplishes the
metonymic combination I associate with video game music.
To further illustrate the difference between the metaphoric and metonymic
functions in cartoon music, another piece scored by Stalling illustrates this same type of
continuity with a different spatial metaphor in play. Galloping Gauchos has an
Argentinean setting, and the music is appropriately reminiscent of the tango (Figure 3).
The same blend of mickey mousing takes place (enacted in this case by Mickey himself)
in the opening sequences. This time, Stalling employs similar ascending figures to
accompany and characterize ascending objects. Figure 4 shows Mickey tossing a
cigarette to impress Minnie, but the lighter timbre of a slide whistle playing a chromatic
glissando matches the daylight atmosphere of the event and sets the light-hearted tone
associated with Mickey.
Figure 3. Mickey rides into town on an ostrich in Gallopin' Gauchos. The character of
Mickey is bright and friendly, despite the swagger evident in this image, and
the setting of the piece is specific enough that Carl Stalling chose to score this
short predominantly as a tango. © 1928 Disney.
Figure 4. Mickey flips a cigarette in the air and catches it with his disembodied teeth to
impress Minnie. © 1928 Disney.
Object 2. Sound clip of cigarette toss, corresponding with Figure 4. Note the difference
in mode and timbre from Object 1. WAV file (obj2.wav; 3.5 seconds; 76.4kb)
These examples illustrate the importance of non-diegetic music and sound to the
communication of cartoon stories. In accordance with the simple redundant/contrapuntal
continuum of film music, the character we are supposed to view as loathsome is "mickey
moused" with predominantly minor or diminished scales and arpeggios while Mickey
Mouse is predominantly narrated with major or diatonic scales. This same principle will
apply to video game music in its metaphoric function, but a perspective from cognitive
psychology elaborates the importance of this audio-visual expression.
Perspectives on Animation and Causality
Exploring the potential for simple shapes and sounds to evoke narrative, cognitive
meaning, Annabel J. Cohen has conducted studies which test subjects' interpretations of
certain types of movement as emotional conditions as well as the effect musical
accompaniment has on the interpretation of the same moving figures. Her first of several
studies identified musical features that correspond to interpretations along a five-point
happy/sad scale. Specifically, major triads played in different octaves at different speeds
revealed that higher, faster repetitions yielded a higher ("happier") score than lower,
slower repetitions (Cohen 362). The fact that such a simple sound system could correlate
so strongly to an emotional scale hints at the complex emotional interpretations of
harmonies, chords, and key changes. Such complexities would require a more elaborate
emotional model, and the results would, no doubt, vary more for each individual listener,
specifically across cultures and musical conditions.20 At any rate, Cohen's studies
20 In response to this problem, musicologists attempting to deal with narrativity in instrumental music adopt
semiotics as a framework. Jean-Jacques Nattiez (cf. footnote 10) uses Claude Lévi-Strauss as a starting
point, for example.
suggest similar conclusions to Alan Leslie's: that emotional interpolation may be
inherently part of interpreting sensory information.
Cohen also led studies which tested the correlation between visual and aural stimuli
by asking subjects to use the same five-point scale to comment on a simple animation of
a bouncing ball. The ball's movements matched the triads, moving up and down at
slower or faster rates and at higher or lower positions. Accordingly, "low, slow bounces
were judged as sad, and high, fast bounces were judged as happy" (Cohen 362). When
the two stimuli were combined, the results were consistent with either the motion or the
music, but when the two diverged, the musical accompaniment was shown to influence the
interpretation of the visual. A slightly more involved study subsequently experimented
with the affective meaning of story interpretation.
Using shapes that again were generally perceived as two lovers escaping a bully,21
Cohen tested two soundtracks for their effect on viewers' interpretation of the scene.
There were differences; specifically one "character" was seen as more active when
viewed with a soundtrack which expressed temporal congruence to "his" movements
(Cohen 363). This apparent association led Cohen to develop the "Congruence-
Associationist framework" which holds that "through structural congruence, music
directs specific visual attention and conveys meaning or associations" (Cohen 370).
In Actual Minds/Possible Worlds Jerome Bruner mentions several studies of the
perception of causality that were performed by cognitive psychologists seeking to
determine if perceiving causality is an innate or learned feature of understanding.
21 Cohen makes no specific reference to Heider and Simmel, and the specific shapes involved are different.
It may be that the archetypal love story line is simply one of the most basic, universal stories we all
tell and experience.
Michotte demonstrated that "when objects move with respect to one another within
highly limited constraints, we see causality" (Bruner 17, emphasis in original). Further
studies-Alan Leslie's, Fritz Heider and Marianne Simmel's-indicate that we also see
"intentionality" and that the ability or desire to interpret information as essentially a story
may be fundamental or automatic from birth (Bruner 18). One can draw many interesting
conclusions from this type of study, notably the implied anthropomorphism of simple
objects that we see as exhibiting intention, but the implications are clearly that just about
anything can be a story.22 Annabel Cohen carried these studies in a different direction by
addressing the kinds of stories we make out of the perceptions we have.
Heider and Simmel tested subjects' interpretation of a series of moving shapes on a
blank background. According to Bruner, the test subjects invariably interpreted the scene
as "two lovers being pursued by a large bully who, upon being thwarted, breaks up the
house in which he has tried to find them" (Bruner 18). It may be that the testers
intentionally modeled their moving shapes after every episode of "Popeye," or it may be
that certain elements in that film, such as the proximity or similarity of the two "lover"
shapes led to certain, inherent conclusions.
With this cognitive framework as a tool, one can begin piecing together the
cognitive functions and semiotic interactions that compose the interrelation of visual and
aural elements which create meaning in cinema, cartoons, and video games. The
congruence-associationist framework also provides a way of discussing the
22 In her playful but insightful Picture This: Perception and Composition, Molly Bang attempts to tell the
"little red riding hood" story with as few shapes as possible. A small red triangle represents the main
character, for example. Through running commentary, Bang explains how the proximity and relative sizes
of other shapes, their colors and location on the page affect the sense of the story. Her "ground up"
approach nicely demonstrates some of the conclusions of Heider and Simmel's studies of causality.
phenomenological difference between what happens when we watch movies and when
we play video games. Much work in film studies already assumes the kind of correlation
that Cohen and her colleagues found to be a cognitive function, and a system of
conventions has developed these pre-existing schemas for musical narration.23
Applying similar findings to specific video games in the context of metaphoric and
metonymic functions of music will show the semantic operations of music and sound in
In this section, I will analyze specific video games that exemplify metaphoric and
metonymic functions of music, but it is important to note that these examples are not
meant to implicate all genres and classes of video games. The three games I will focus
on here, Super Mario Brothers (SMB), Legend of Zelda: Ocarina of Time (LZ:OT), and
Silent Hill (SH), were chosen for their strong narrative component and because they
provide ready examples of the types of correlations I intend to draw with cinematic
conventions of music. However, as my intent is not to develop a totalizing view of video
game sound, certain game genres will not lend themselves as easily to this present
analysis.24 Furthermore, while the first two examples are a bit dated, they are not meant
23 Historically (at least since the 19th century) there has been a divide in classical music between "Absolute"
and "Program" Music. Program music like Smetana's The Moldau depicts non-musical pictorial settings or
events; The Moldau musically traces the journey of the Moldau river in the Czech Republic, and Berlioz's
Symphonie Fantastique is an autobiography of sorts. By contrast, absolute music is composed as music for music's
sake or "music composed with no extra musical implications" (Alfred's Pocket Dictionary of Music 9).
Nattiez argues that the semantic possibilities and temporal frame of music permit narrative approaches to
music (Nattiez 244), but such approaches must recognize that music alone relies primarily on syntagmatic
24 I am thinking here of the "Sim" category of video games (SimCity; The Sims), but the rhythm genre of
games (Dance Dance Revolution) poses similar challenges for opposite reasons.
as archetypal or foundational instances of music in games though they do indicate
significant accomplishments in video game music.
Super Mario Brothers
In 1985, Nintendo of America released what would become arguably the most
influential console game, Super Mario Brothers. The side-scrolling Platform game would
spawn several spin-offs through, so far, four generations of consoles and dozens of rip-
offs inspired by Mario's success. While it would be an oversimplification to say that
Super Mario Brothers is important simply because it initiated video game tropes like
power-ups, extra lives, and a metaphor of geographic expansion conveyed by progress
through progressively difficult levels (Poole 42), SMB is, like Skeleton Dance, an
opportunity to examine important aspects of its respective medium. More importantly,
SMB provides a ready example of musical functions borrowed from animation at an early
stage in video games' development. Specifically, Mario's (or Luigi's) movement on the
screen is accompanied by a musical mickey mousing gesture. In line with Koji Kondo's
peppy theme music, Mario's "jump" (Figure 5a) is accompanied by an ascending
chromatic glissando (Figure 5b). Like Mickey's cigarette toss (Figure 4), Mario's leap
has a pleasant sound (i.e., it does not use minor or diminished intervals), not only because
we are supposed to identify favorably with Mario, but also because a typical game player
will likely hear the same sound repeated hundreds of times in a dedicated period of game
play.25 The mickey mousing effect is also intended to emphasize the physicality of Mario
and his kinesthetic involvement with his environment. In Figure 6a, Mario has "powered
25 Mario's characteristic jump is also historically significant because he was the first player-character to use
jumping as his primary means of both exploration and combat, so much so that his original name in pre-
Mario iterations like Donkey Kong's "carpenter character" was "Jumpman" (Poole 42).
up" to Super Mario, so, as Figure 6b demonstrates, the sound effect of his jumping is
mimicked as the same musical figure an octave below the original. Other movements
and collisions in the game respond to Mario in a way that enhances the impact of the
represented on-screen events. In this case, the musical mickey mousing is in tune with
the metaphoric creation of a believable game world, one which is characterized by the
non-diegetic theme music.
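The octave relationship between the two jump effects can be stated numerically: lowering a musical figure by one octave halves the frequency of every note while leaving its contour intact. The following is only a minimal sketch of that relationship. The starting pitch (A4, 440 Hz), the number of steps, and the function names are illustrative assumptions and are not taken from the game's actual sound data.

```python
# Hypothetical sketch: the starting pitch and step count are illustrative
# assumptions, not values measured from Super Mario Brothers.
SEMITONE = 2 ** (1 / 12)  # equal-temperament ratio between adjacent semitones


def chromatic_glissando(start_hz, steps):
    """Return the frequencies of an ascending chromatic run from start_hz."""
    return [start_hz * SEMITONE ** i for i in range(steps)]


def octave_lower(frequencies):
    """Transpose a figure down one octave by halving every frequency."""
    return [f / 2 for f in frequencies]


small_jump = chromatic_glissando(440.0, 8)  # assumed starting note: A4
super_jump = octave_lower(small_jump)       # same contour, one octave below

for small, big in zip(small_jump, super_jump):
    print(f"{small:8.2f} Hz -> {big:8.2f} Hz")
```

Because every note is scaled by the same factor, the listener hears the same gesture at a "heavier" register, which is consistent with the reading of the effect as an index of the character's larger body.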
Figure 5. Regular (small) Mario jumping produces a musical phrase which is repeated
continuously as one plays the game. This effect is clearly derived from
mickey mousing in film and cartoons. A) Mario jumping. B) Approximate
musical notation for Mario's jump effect. Super Mario Brothers. (Nintendo
Entertainment System) 1985 Nintendo of America Inc.
Object 3. Sound clip of "small Mario" jump effect. WAV file. (obj3.wav; 3.5 seconds;
Figure 6. When "Super Mario" jumps, his mickey mousing effect becomes exactly one
octave lower. This illustrates the kinetic, emulated physicality of mickey
mousing which enhances the sense of the character's body in space. A) Super
Mario jumping. B) Approximate musical notation for Super Mario's jump
effect. SMB. (NES) 1985 Nintendo of America Inc.
Object 4. Sound clip of Super Mario jump effect. WAV file (obj4.wav; 3.5 seconds; 129
Similarly, music and sound effects serve a syntagmatic or metonymic function in
encouraging successful game play by providing positive reinforcement as consequences
for actions in the game. "Dying" in Super Mario Brothers (Figure 7) produces an
arresting staccato pulse followed by a conciliatory musical cadence reminiscent of the
music one hears upon misestimating the value of a vacuum cleaner or dish set on The
Price is Right. The music is a descending figure, mimicking Mario's ejection from the
playing field. The music is a coded message of failure, but similar messages of success
reinforce the successful completion of levels in the game, and, on a smaller level, the
satisfying "ching" of collecting gold coins reinforces the behavior which is also
strategically advantageous to advancing in the game. Considering an entire level as a
musical composition, "death" or "success" musical messages serve as a cadence to that
world's musical structure. Bowen's analysis of Atari 2600 games as musical structures
identifies death music as a cadenza to the rhythmical music of the game's sound effects
(Bowen n.pag.). In these ways, music works at a syntagmatic level across a musical
structure to encourage the user's continued play. The game's syntagmatic structure is
dependent on user input, so music that engages further participation can be said to
function metonymically toward the continuity of the game play experience.
Figure 7. Mario "dying." SMB. (NES) 1985 Nintendo of America Inc.
Object 5. Sound clip of "failure" cadence. The phrase of music a player hears each time
Mario dies is in the same key as the "Overworld Theme" and provides a solid
musical transition from one trial to the next in the trial-and-error pattern--a
central part of the experience of Platform games-while avoiding the finality
of the dirge-like Game Over music. WAV file. (obj5.wav; 3 seconds; 58.4kb)
In a different role, Kondo's "Overworld Theme" (Object 6) has been described as a
funk or jazz tune "but with so much energy pumped into each articulated note, one is not
sure whether it invokes cheesy Vegas lounge music or a Dixieland band" (Belinkie
n.pag.). This sunny-sounding tune is heard only in areas of the game world (the
Mushroom Kingdom or Overworld) where the level is above ground (Figures 5A and
6A). Transporting via tunnel to the underworld (Figure 8), one hears the "Underworld
Theme" (excerpt Figure 9) which modulates to the key of G minor and has a hollow,
eerie feel. Also, though the key of the piece is scored at G minor, the melody lacks a
tonal center (i.e., it never comes to rest on the tonic, G) and relies on tense chromatic
passages. The chromatic tone clusters contribute to the feeling of enclosed
claustrophobic space of the underworld, and the lack of tonal center conveys the
disorientation often felt in underground spaces.
Object 6. Musical excerpt from "Overworld Theme." WAV file. (obj6.wav; 20 seconds;
Figure 8. In the Underworld the music changes to match the shift of location that has
occurred in the story-line. SMB. (NES) 1985 Nintendo of America Inc.
Figure 9. Excerpt from arrangement of Underworld theme for piano. Arr. Brian Auyeung
Object 7. Sound clip from Underworld. Note how the lack of tonal center and use of
minor tone clusters accentuates the loneliness and "angularity" one can
imagine feeling under ground. WAV file (obj7.wav; 12 sec.; 275kb)
Other areas of the game world have their own musical signature as well. Figure 9
shows the musical accompaniment for the underwater stages, a lilting, somewhat
peaceful waltz. These basic themes so far characterize the environment of Mario's
world. They allow us as listeners to make certain predictable associations with types of
melody-major vs. minor is just the simplest identification one could make-but they
also signify to the player that the world itself is static in that the music repeats on a loop
until the global danger state changes. A player must successfully complete a level within
a time limit, 300 seconds, and the music provides a motivational cue as time is running
short to encourage the player to complete the level. The music remains in the same key,
but doubles its tempo, adding a sense of urgency to the mood of the environment. This
cue acts as a heuristic or metonymic device, and it breaks the frame of immersion
encouraged by the repeating loop that plays through most of the level. The music is then
shifting into a mode of engaging a player's response by calling him to faster or more
skillful interaction with the game. Similarly at the paradigmatic stratum, the syntagmatic
structure of music as metonym again appears in the tensely chromatic score for the castle
The fourth "level" of each "world" is set in the interior of a castle, and builds up to
an ultimate battle with a boss character, subordinate manifestations of Bowser. This
music is similar to the Underworld theme in its lack of tonal center and reliance on
chromatics. Here, the confined space afforded to the player (Figure 10) is mirrored in the
dense cluster of notes that carry the theme (Figure 11).
Figure 10. The space of the castle levels is even further compressed than the Underworld
levels. SMB. (NES) 1985 Nintendo of America Inc.
Figure 11. Approximate score for castle level. Significantly, the notes on the staff mimic
the compression of the game's space in their density. Arr. Brian Auyeung
Object 8. Sound clip of music in a castle level. WAV file. (obj8.wav; 16 seconds; 342kb)
Therefore the music acts metaphorically as an indicator of mood and environment,
and it acts metonymically as a structural device to engage the player's continued
involvement and action in the game. The paradigmatic shift of environments is signaled
and accompanied by shift in musical mood, and the syntagmatic contiguity of the game
as an assemblage of several different world-types is maintained by the didactic effect of
the music as motivation. The music's tempo is all that moves it from a paradigmatic role
to its syntagmatic function.
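The time-limit cue described above can be sketched as a simple rule that leaves the looping theme untouched except for its tempo. This is a minimal sketch under stated assumptions: the text gives only the 300-second limit and the doubling of tempo, so the threshold at which the cue fires (LOW_TIME_THRESHOLD) and the base tempo (BASE_TEMPO_BPM) are hypothetical values chosen for illustration.

```python
# Hypothetical sketch: LOW_TIME_THRESHOLD and BASE_TEMPO_BPM are illustrative
# assumptions; only the 300-second limit and the doubling come from the text.
TIME_LIMIT = 300          # seconds allowed to complete a level
LOW_TIME_THRESHOLD = 100  # assumed point at which the "hurry" cue begins
BASE_TEMPO_BPM = 100      # assumed tempo of the looping level theme


def current_tempo(seconds_remaining):
    """Return the tempo of the level theme for the time left on the clock."""
    if seconds_remaining < LOW_TIME_THRESHOLD:
        # Same key and melody, twice the speed: the metonymic "hurry" cue.
        return BASE_TEMPO_BPM * 2
    # The unchanged loop: the paradigmatic, mood-setting role.
    return BASE_TEMPO_BPM


for remaining in (TIME_LIMIT, 150, 99, 10):
    print(remaining, current_tempo(remaining))
```

The sketch makes the point of the argument visible in miniature: nothing about the theme's key or melody changes, and a single parameter moves the music from setting a mood to prompting an action.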
Legend of Zelda: Ocarina of Time
Another game that extends these somewhat archetypal musical patterns is Legend
of Zelda: Ocarina of Time. This game extended the popular and influential Legend of
Zelda series into the 3-dimensional world made possible by the Nintendo 64 console.
Like Super Mario Brothers, Ocarina of Time employs music to function as both
metaphoric and metonymic devices, but the complexity of the musical score and the real-
time blending allowed by the game engine creates a more lush, cinematic feel. Composer
Koji Kondo again uses particular melodic themes to identify specific areas of the game
world in something like Wagner's leitmotifs acting in reverse.26 Furthermore, Ocarina of
Time employs music directly as a heuristic device to further game play in that players
must successfully memorize short musical themes which enable special areas or abilities.
Ocarina of Time's genre is not as straightforward as SMB's. It is usually
classified under "Adventure" games (or the unhelpful Action/Adventure categorization),
but it clearly has elements from the Platform genre (jumping to solve puzzles, exploring
space, defeating "bosses" to complete areas of the game) and the Role-Playing Game
(RPG) genre (keeping track of and purchasing items, using a map, and "leveling up"
one's character). The setting of the game is clearly one of Fantasy in that one encounters
elves, fairies, wizards, and humans uneasily coexisting in a world powered by magic and
26 Many of Wagner's operas assign a musical "signature" to characters that the audience hears when that
character appears on stage. Leitmotifs can also interact with one another to mimic the tension of the drama,
but one of their purposes is to help the audience identify characters as they enter the stage. The fixed,
stationary audience witnesses a very large number of characters pass through the space of the stage in a Wagnerian opera,
but in The Ocarina of Time, the audience travels and the leitmotifs are attached to the stationary
environments of Hyrule.
potions. The tone of the story and visuals is also more serious than Mario's.
Accordingly, the game's mickey mousing effects mostly become realistic sound effects.
The player-character, Link, does have a jumping sound effect, but the musical
ascension is replaced by an aggressive grunt. Collecting coins has a similar "ching"
which is, by now, universal in games which involve collecting coins, and success in the
game is similarly reinforced by a musical "reward." Ocarina of Time thus enacts the
same paradigmatic and syntagmatic structures as Super Mario Brothers but the
complexity of the paradigmatic structure and the use of music as an actual heuristic
device involved in game play allow the identification of some more intriguing metaphoric
and metonymic operations through the eponymous ocarina.
Figure 12. Playing the ocarina in The Legend of Zelda: Ocarina of Time. 1998
Nintendo of America Inc
Link's most important item is his Ocarina, which a gamer must learn to "play" with
the controller. In "ocarina mode" (Figure 12), a player presses keys that correspond to
notes on the potato-shaped instrument. Figure 13 shows the basic 5-note scale one needs
to unlock key melodies, though additional manipulation from other control buttons makes
it possible for a skilled player to reproduce a complete scale.27 Successfully playing a
27 At least one web site offers instruction in playing and composing with the ocarina through the controller
("64Zelda Music Studio"
melody fragment unlocks an animation which completes the melody and performs the
specified action where appropriate. Not only do these musical themes flavor the
experience of play, they are also reproduced in the backgrounds of several of the games
Figure 13. Scale of base Ocarina note positions
Figure 14 shows the melody that must be played to perform "Saria's Song" which
permits teleportation to the Lost Woods. In the Lost Woods, the constantly running
theme music (Object 9) extends and elaborates Saria's song in a small-scale, looping
"theme and variations" structure. Thus the syntagma of the musical heuristic merges
with the paradigmatic axis of the Lost Woods' theme. The Temple of Time Theme also
replays the "Song of Time" in a chorale effect mimicking a cathedral's echoing
dimensions. The importance of this blending of metaphoric and metonymic functions is
also significant in both of these cases because the player hears a melodic figure repeated
in the orchestrated underscore that Link will have to "hear" at a later time to use the
ocarina to unlock the appropriate power. The powers of the melodic fragments cannot be
unlocked until the player has reached the appropriate moment in the game, so the
paradigmatic atmosphere music also acts as melodic foreshadowing that
often goes unrecognized, and players report feelings of déjà vu as the melodies they
learn have an eerie familiarity.
Figure 14. "Saria's Song" from LZ:OT
Object 9. "Lost Woods Theme." Listen for the repetition and variation of "Saria's Song."
WAV file (obj9.wav; 20 seconds; 500kb)
The significance of Ocarina of Time's musical score goes beyond the subtle
interactions of foreshadowing and heuristic, however. The game engine's sophistication
is such that, for the first time, musical phrases can blend seamlessly as Link crosses one
sonic area into another and, more importantly, as Link encounters a dangerous enemy.
Object 10 is a clip of what happens musically as Link approaches an enemy. The effect is
initially subtle, but blossoms into full-blown "attack" music which, much like the Castle
Theme from Super Mario Brothers, heightens the drama of the conflict and alerts the
player to more focused interaction. The sound engine of the Zelda game demonstrates
the same principle of maintaining contiguity, but the role it plays is somewhat different
since the 3-dimensional construction of Link's environment often allows a player to
choose whether or not to approach the source of the "danger music," but the same
overarching structure holds true when Link encounters level bosses and the final enemy,
Ganondorf. The application of this safety/danger binary in the fluid schematic of the 3-
dimensional space of Hyrule (Princess Zelda's Kingdom) exhibits the complexity and
richness of the simulated environment. The paradigmatic soundtrack of the game is both
charming and haunting, and the complexity of the blending and overlapping musical
themes invites serious immersion in the game world, but, again, the danger to the
character that the "danger music" signifies threatens to disrupt the immersion of the story
and forces the player to engage the game as an active participant. The successful
experience of both dimensions, therefore, approaches an ideal flow state in play.
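One way to picture the seamless blending of safety and danger music is as a crossfade driven by Link's distance from the nearest enemy. The following is a minimal sketch of that idea only; it makes no claim about how the game's sound engine is actually implemented. The linear fade, the function name danger_mix, and both radii are hypothetical choices made for illustration.

```python
# Hypothetical sketch: the linear crossfade and both radii are illustrative
# assumptions, not a description of the game's actual sound engine.
def danger_mix(distance, outer_radius=20.0, inner_radius=5.0):
    """Return (safe_gain, danger_gain) for a distance to the nearest enemy.

    Beyond outer_radius only the area's theme is heard; inside inner_radius
    only the "danger" theme is heard; in between the two are blended.
    """
    if distance >= outer_radius:
        danger = 0.0
    elif distance <= inner_radius:
        danger = 1.0
    else:
        danger = (outer_radius - distance) / (outer_radius - inner_radius)
    return 1.0 - danger, danger


for d in (25.0, 12.5, 5.0):
    print(d, danger_mix(d))
```

Because the gains depend on a position the player controls, the sketch also illustrates the point made above: in a 3-dimensional space the player can often choose whether to walk into the "danger music" or back out of it.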
Object 10. LZ:OT "Danger" theme-the blending of safety state/danger state musical
metaphors. WAV File (obj10.wav; 44 seconds; 946kb)
So far, the two games I have discussed have demonstrated that fulfilling the
metaphoric and metonymic functions of video game music often amounts to two
overarching modes of video game music, safety state and danger state. Each musical
type is a "state" because individually they are simply substitutions, one for the other,
along the paradigmatic axis, but in the act of transferring from one to the other, the
syntagma of the game requires interaction from the user to maintain contiguity. A
different category of music signals these metonymic moments, and reinforces player
interaction with "reward" music or "failure" music. Overall, the trajectory of these
semantic impressions the music creates is directed by the musical scale being major or
minor. Major passages relate "safety" or "reward" states while minor or diminished
chords signify "danger" or "punishment." Other complexities are clearly involved, but
this simple division suffices for the present argument.
In Survival Horror games, the syntactic structure is for the most part the same, but
the musical choices are not as straightforward. The classic Silent Hill has a rich and
varied soundtrack, but there is no music in a major key. In fact, the "safe state" is not
present at all in the same sense, so the music never settles on or appears to move toward
any kind of resolution. This is in part because the play of Survival Horror games
(generically derivative of the "Adventure" genre) is not punctuated with the same rhythm
of trial and error attempts at a skilled task. The dominant problem solving mode of
Adventure and Survival Horror games is puzzle solving, and the Survival Horror game is
unique in that armies of zombies and other undead creatures block the path to puzzles'
solutions. Other games with monstrous enemies, Doom, for example, require the simple
annihilation of enemies, but the limited ammunition and inefficient camera angles that
define Survival Horror make avoiding enemies just as much of a priority. Therefore, musical
scores like Akira Yamaoka's for Silent Hill never have the safe moments of exploration
offered by platform games, and they must sustain a consistent and pervasive mood of
terror or apprehension in the player. The Adventure genre format calls for exploration as
the primary user input, so in Silent Hill the music is always in a degree of "danger state"
in order to compel the player through the game's spaces. The mood of the game is
crucial to the horrific "feel," but it is also, therefore, enacting a metonymic function by
compelling continual progress through the game. The town of Silent Hill is never a safe
place, so players maintain the game's contiguity by trying to escape Silent Hill which is
an embodiment of the musical danger state. In general Survival Horror games rely on
conventions of horror film sound to effectively create the mood of horror required for the
game, but the trajectory is slightly shifted. Neumeyer and Buhler write that
In suspense films, subjective crisis and psychological rupture are often prominent
themes, with the character experiencing a debilitating loss of centre, which is figured
musically by the absence of a tonal centre. In horror films, the monster often embodies a
kind of dystopian projection, a means of figuring unintended consequences of the system,
which take musical shape as tonality gone awry to the point of incomprehension
(Neumeyer and Buhler 23).
Silent Hill does offer a "debilitating loss of centre" for the main character, Harry
Mason, and the music is significantly atonal, often eschewing melody at all and utilizing
a percussive "industrial" sound. But the environment itself is the site of "dystopian
projection," more so than any of the actual monsters. As Figures 15 and 16 illustrate,
the space of Silent Hill undergoes rapid physical change from a foggy, empty town that is
otherwise normal to a blood-soaked, nightmarish parody of the same space. This change
is always reflected musically as the quietly unnerving throb of the foggy Silent Hill gives
way to a cacophonous ringing of metallic noises and atonal chaos. This musical chaos is
the only cue to reflect the player-character's psychological state; he is nearly always
facing away from the camera, and the pre-rendered cut scenes and voice-overs are
delivered in as dull a voice-acting performance as any in recent memory.
Figure 15. "Normal" school building. Silent Hill 1999 Konami, Sony Computer
Object 11. Sound clip of 'normal' school (Figure 15) with a basic, ambient soundtrack.
WAV file (obj11.wav; 23 seconds; 501kb)
Figure 16. Same space in the school-radically altered. Silent Hill 1999 Konami, Sony
Computer Entertainment Japan
Object 12. Sound clip of altered school with aggressive, threatening soundtrack. WAV
file (obj12.wav; 18 seconds; 378kb)
Survival Horror games in general do often use the same formulas of music as
classic horror films, but in the context of the metaphoric and metonymic functions of
video game music, the semantic alignment of the music is slightly altered. Specifically,
silence is often employed in films to create a sense of building tension, and in the
dynamic of paradigmatic game music, silence, when used, is equivalent to the safe state,
though it accomplishes an opposite effect.
Another Survival Horror title, Resident Evil: Code Veronica (RE:CV), typifies this
displacement of the safe/danger binary. Figure 17A shows the player-character, Claire,
exploring a hallway in the opening sequences of the game. There are no enemies, so non-
diegetic music is silent. The next scene initiates an encounter with zombies (Figure 17B),
and enacts the familiar danger state accompaniment of rhythmically intense music in a
diminished or minor key. The context of the Survival Horror genre associates this game
with horror film such that the silence of the first scene puts the player on edge rather than
reassuring him: the absence of danger in the immediate environment increases the
expectation that danger will soon appear. The appearance of the danger is, therefore,
heightened in intensity by way of its sudden intrusion into silence.
Figure 17. Silence vs. "danger music" in RE:CV A) Silent exploration B) Dramatic
zombie attack. Resident Evil: Code Veronica X 2001 Capcom U.S.A. Inc.
These moments from the opening sequences of Code Veronica are the first chance
for the player to encounter and deal with forces of the undead, but Silent Hill's opening
sequences take a different approach that also reveals an allegiance to horror film uses of
sound. In Kubrick's The Shining, for example, the music will often rise steadily to a
cacophonous crescendo to match a character's escalating terror or psychosis, and in Silent
Hill a similar effect is created by overlapping musical sequences that are cued as "event
triggers" when the player enters progressively horrific areas.
The opening cut-scene of Silent Hill provides the set up for the story, which has to
do with Harry Mason taking his daughter, Cheryl, on a vacation to the resort town of
Silent Hill. After a mysterious accident en route, Harry awakes to find himself alone in a
mysteriously foggy and strangely empty Silent Hill with no sign of Cheryl. The music is
faint, mostly atmospheric ambience that matches the foggy streets with a "swooshing"
sound or a low throb. Harry hears footsteps, and, in one of the eeriest sequences in any
video game, the player must follow a shadowy figure (who may or may not be Cheryl)
who always stays just beyond the edge of visibility. The figure eventually leads Harry
into an alley, which enacts the sequence of images and sound clips in Figures 18-21.
The forced camera angles cause the point-of-view to careen wildly as Harry enters
different rooms of the alley, and as the alleyway becomes suddenly darker, Harry's terror
(and the player's) is both reflected and dictated by the soundtrack growing in volume and
Figure 18. "That's strange. It's getting darker." Silent Hill 1999 Konami, Sony
Computer Entertainment Japan
Object 13. Sound clip corresponding to Figure 18. Note the "air-raid siren" sound effect
which has increased its volume significantly from its minimal presence in the
basic ambience. WAV file (obj 13.wav; 21 seconds; 457kb)
Figure 19. Further down the alley. Silent Hill 1999 Konami, Sony Computer Entertainment Japan
Object 14. Sound clip to accompany Figure 19. Note the percussive "industrial" sound
effect. WAV file (obj 14.wav; 21 seconds; 455kb)
Figure 20. Further still. Organ sound seems to trigger when Harry steps over a puddle of
blood. Silent Hill 1999 Konami, Sony Computer Entertainment Japan
Object 15. Sound clip to accompany Figure 20. The ascending organ sound begins to
offer a sense of key or tone, since all the sounds so far have been industrial or
percussive, and this organ sound is the first "real" organ, but this organ line is
decidedly atonal in its semblance of a melody. WAV file (obj 14.wav; 20
Figure 21. End of the alley. "What's going on here?" Silent Hill 1999 Konami, Sony
Computer Entertainment Japan
Object 16. Final sound clip from alley sequence. The grunting or wheezing sounds in
the clip are produced by the child-like zombie-creatures. WAV file
(obj 5.wav; 30 seconds; 658kb)
At each successive stage of the alley, the visuals become more nightmarish, and at
each stage represented above in Figures 18-21, a new voice is added to the soundtrack.
Finally, after passing by a few ominous hospital implements and discovering what
appears to be a flayed and crucified human corpse, Harry is trapped inside a room with a
pair of child-like, knife-wielding zombies. The player has control over Harry, but since
Harry has no weapons, he is powerless to fight back and can only run away from the
creatures in a tight space. In a horrifying moment, the creatures attack and appear to
chew on Harry, and the player must watch helplessly. The anxiety of this moment is
heightened by the gruesome visuals, the soundtrack, and by the standard video game
trope of character-death. The consequence or punishment in an Adventure game for
allowing the player-character to die is being forced to repeat material that has already
been explored, and since the overarching, eponymous goal of Survival Horror is to
survive, actual character-death may only occur a handful of times throughout playing
Silent Hill. The music that drives the growing terror of this alley sequence, which leads to a
simulated death (i.e., Harry does not really "die" in the game; this scene leads to a
pre-rendered cut scene of Harry waking up in a diner wondering if what just happened was a
dream), builds on a filmic technique of building suspense, but the musical metaphor of the
sequence mimics the visuals of the environment, the "embedded internal" experience of
Harry, and our own emotional response as the player because the music is non-diegetic.
That is, the musical underscore seems to happen "outside" of the world of the story as a
device to charge the emotional response to the sequence. The music is, therefore, acting
symbolically from Harry's point of view in that he does not "hear" it, but another feature
of the game, unique to Silent Hill, suggests a more complicated possibility for the
diegetic/non-diegetic question of musical origination.
Harry is eventually equipped with weapons to fight against the various creatures
that he will encounter as he proceeds through his quest to locate his daughter, but his
most important tool is a "broken" radio that emits sound of a recognizable frequency
whenever a monster is near. The claustrophobic player perspective and ubiquitous fog or
darkness make hearing more important to successful game play than seeing. Once a
player is used to the system, she can use Harry's targeting ability to automatically aim at
the nearest enemy, whether it is on screen or not, upon hearing the specific noise emitted
by the radio. Since most of the enemies will approach from above or behind Harry, a
player may not ever see certain enemies, and since the sounds appear gradually and swell
to a crescendo as the monster gets nearer, the effect works on the same principle as the
alley sequence in the opening of the game. Since this is also a strategic device built into
the game, and because it merges with the soundtrack though its source is visibly present
in the game environment, the radio's sounds enact a metonymic musical function that
amounts to a syntagmatic unification of the game play experience. The radio sound is
crucial to game play, so the syntagma is a metonym enacting player engagement.
By combining conventions of both video game and horror film, the creators of
Silent Hill create an experience that is driven musically by the grotesque exaggeration of
metaphoric functions. The syntagmatic structure is "shifted" toward a psychotic effect by
the removal of the "safety state" syntagma, and the metonymic functions operate through
an arresting juncture between diegetic and non-diegetic sound.
In this thesis, I have sought to explore various applications of a
metaphoric/metonymic model of video game music in specific video games, but I have
attempted to avoid making any universal claims about the general or absolute nature of
the phenomenon across all genres of video games. While the basic safety/danger binary
can be witnessed in many genres, the metaphoric use of music to create a sense of space
is limited to genres that already lend themselves to a narrativistic interpretation. My use
of the linguistic model of the paradigmatic and syntagmatic axes may prove useful in
other analyses that investigate the semantic properties of video game narrative, but the
implications of this theory in regard to music are, perhaps, most easily applied to the
question of immersion, engagement, and flow. Douglas and Hargadon write of the "Fifth
Business" as "the agent who exists solely to chivvy the characters and plot toward its
conclusion" (Douglas and Hargadon 200, 201). I disagree with their conclusion that an
anthropomorphized agent would better serve users of interactives like Adventure games,
which often require complex or obscure puzzle solving scripts. But it seems that music is
one of the ways video games help balance out the tension between immersion and
engagement that arises in an environment that involves both story and algorithm.
By simultaneously enriching the worlds of video games and assisting the player in
navigating the syntagmatic structure of video games, music is essential to the semantic
operations of a video game as an interactive story.
LIST OF REFERENCES
Aarseth, Espen. "Computer Game Studies Year One." Game Studies. 1.1 (2001): 31
---. "Genre Trouble." First Person: New Media as Story, Performance, and Game. Eds.
Noah Wardrip-Fruin and Pat Harrigan. Cambridge: MIT Press, 2004.
---. Cybertext: Perspectives on Ergodic Literature. Baltimore: Johns Hopkins University
Absolute Music. Alfred's Pocket Dictionary of Music. Comp. and Ed. Sandy Feldstein.
Sherman Oaks, CA: Alfred Publishing Co. 1985.
The Animatrix: World Record. Dir. Takeshi Koike. Perf. John Wesley, Victor Williams.
2003. DVD. Warner Home Video, 2003.
Bang, Molly. Picture This: Perception & Composition. Boston: Little, Brown, 1991.
Belinkie, Matthew. "Video Game Music: Not Just Kids Stuff." Video Game Music
Archive. Online. 15 December 1999. 29 March 2004.
Berlioz, Hector. Symphonie fantastique, op. 14. The Philadelphia Orch. Cond. Riccardo
Muti. EMI Classics, 1999.
Bessell, David. "What's that Funny Noise? An Examination of the Role of Music in Cool
Boarders 2, Alien Trilogy, and Medievil 2." Screenplay:
cinema/videogame/interface. Eds. Geoff King and Tanya Krzywinska. London:
Wallflower Press, 2002.
Bruner, Jerome. Actual Minds, Possible Worlds. Cambridge, Mass.: Harvard U P, 1986.
Capcom. Resident Evil, Code: Veronica. (PS2). Sunnyvale, CA: Capcom, 2000.
Chatman, Seymour. Story and Discourse: Narrative Structure in Fiction and Film.
Ithaca: Cornell U P, 1978.
Chion, Michel. Audio-Vision: Sound on Screen. Trans. and Ed. Claudia Gorbman. New
York: Columbia U P, 1994.
Cohen, Annabel. "Film Music: Perspectives from Cognitive Psychology." Music and
Cinema. Eds. James Buhler, Caryl Flinn, and David Neumeyer. Hanover, NH:
University Press of New England, 2000.
---. The Functions of Music in Multi-Media: A Cognitive Approach. Proc. of Fifth Annual
Conference on Music Perception and Cognition, Aug. 1998, Seoul National U.
Seoul: Western Music Research Institute, 1998.
Darley, Andrew. Visual Digital Culture: Surface Play and Spectacle in New Media Genres.
New York: Routledge, 2000.
Dor, Joel. Introduction to the Reading of Lacan: The Unconscious Structured Like a
Language. Eds. Judith Feher Gurewich and Susan Fairfield. Northvale, N.J.: J.
Douglas, J. Yellowlees and Andrew Hargadon. "The Pleasure of Immersion and
Interaction: Schemas, Scripts, and the Fifth Business." First Person: New Media as
Story, Performance, and Game. Eds. Noah Wardrip-Fruin and Pat Harrigan.
Cambridge: MIT Press, 2004.
Erard, Michael. "The Ivy-Covered Console." New York Times. 26 February 2004: G1.
Eskelinen, Markku. "Towards Computer Game Studies." First Person: New Media as
Story, Performance, and Game. Eds. Noah Wardrip-Fruin and Pat Harrigan.
Cambridge: MIT Press, 2004.
Flitterman-Lewis, Sandy. "Psychoanalysis, Film and Television." Channels of Discourse,
Reassembled: Television and Contemporary Criticism. 2nd Rev. Edn. Chapel Hill: U
of North Carolina P, 1992.
Frasca, Gonzalo. "Simulation Versus Narrative: Introduction to Ludology." The Video
Game Theory Reader. Eds. Mark J.P. Wolf and Bernard Perron. New York:
"Galloping Gauchos." Dir. Ub Iwerks. 1928. Walt Disney Treasures: Mickey Mouse in
Black and White. DVD. Walt Disney Home Video, 2002.
Grieg, Edvard. Lyric Suite, Op. 54: No. 4: "March of the Trolls." Cond. Leonard
Bernstein. Sony, 1994.
Jakobson, Roman. "The Metaphoric and Metonymic Poles." Metaphor and Metonymy in
Comparison and Contrast. Cognitive Linguistics Research 20. Eds. Rene Dirven
and Ralf Porings. New York: Mouton de Gruyter, 2002.
Juul, Jesper. "Introduction to Game Time." First Person: New Media as Story,
Performance, and Game. Eds. Noah Wardrip-Fruin and Pat Harrigan. Cambridge:
MIT Press, 2004.
Konami. Silent Hill. (Playstation). Redwood City, CA: Konami, 1999.
Mio T. "64Zelda Music Studio." 12 December 2000. 2 April 2004.
Montfort, Nick. "Notes from Form, Culture, and Video Game Criticism." [Weblog entry.]
Grand Text Auto. Georgia Tech. 6 March 2004. 29 March 2004.
--- and Noah Wardrip-Fruin Eds. First Person: New Media as Story, Performance, and
Game. Cambridge: MIT Press, 2004.
Morris, Sue. "First-Person Shooters: A Game Apparatus." Screenplay:
cinema/videogame/interface. Eds. Geoff King and Tanya Krzywinska. London:
Wallflower Press, 2002.
Nattiez, Jean-Jacques; Katharine Ellis. "Can One Speak of Narrativity in
Music?" Journal of the Royal Musical Association. 115: 2. (1990), 240-257. Online.
JSTOR. 2 April
Nell, Victor. Lost in a Book: The Psychology of Reading for Pleasure. New Haven: Yale
University Press, 1988.
Neumeyer, David and James Buhler. "Analytical and Interpretive Approaches to Film
Music (I): Analysing the Music." Film Music: Critical Approaches. Ed. K.J.
Donnelly. New York: The Continuum International Publishing Group, 2001.
Nintendo. The Legend of Zelda: Ocarina of Time. (Nintendo 64). Redmond, WA:
Nintendo of America Inc, 1998.
---. Super Mario Bros. (NES). Redmond, WA: Nintendo of America Inc, 1985.
---. Tetris. (NES). Redmond, WA: Nintendo of America Inc, 1989.
Poole, Steven. Trigger Happy: the Inner Life of Video Games. London: Fourth Estate,
"Platform Game." Wikipedia: The Free Encyclopedia. Online. 10 Mar 2004 13:58 UTC.
2 April 2004.
Program Music. Alfred's Pocket Dictionary of Music. Comp. and Ed. Sandy Feldstein.
Sherman Oaks, CA: Alfred Publishing Co. 1985.
Rockstar North. Grand Theft Auto: Vice City. (Playstation 2) San Diego, CA: Take
Two Interactive, 2003.
Rogue Entertainment. American McGee's Alice. (PC). Redwood City, CA: Electronic
Saint-Saens, Camille. Danse Macabre. op. 40. (g, 1875). Orchestre National de France.
Cond. Lorin Maazel. Sony, 1995.
Shining, The. Dir. Stanley Kubrick. Perf. Jack Nicholson and Shelley Duvall. Warner
"Skeleton Dance." Dir. Ub Iwerks. Comp. Carl Stalling. 1929. Disney Treasures: Silly
Symphonies. DVD. Disney Home Video, 2001.
Smetana, Bedrich. My Fatherland II. Die Moldau. (T: 111). Boston Symphony Orch.
Cond. Rafael Kubelik. Deutsche Grammophon, 1990.
"Survival horror game." Wikipedia: The Free Encyclopedia. Online. 13 Mar 2004 21:35
UTC. 2 April 2004.
Stalling, Carl. Interview with Mike Barrier. Reprinted as "An Interview with Carl
Stalling." The Cartoon Music Book. Eds. Daniel Goldmark and Yuval Taylor.
Chicago: A Capella Books, 2002.
Stam, Robert, Robert Burgoyne and Sandy Flitterman-Lewis. New Vocabularies in Film
Semiotics: Structuralism, Post-Structuralism, and Beyond. New York: Routledge,
Strauss, Neil. "Tunes for Toons: A Cartoon Music Primer." The Cartoon Music Book.
Eds. Daniel Goldmark and Yuval Taylor. Chicago: A Capella Books, 2002.
Taylor, Laurie. "When Seams Fall Apart: Video game space and the player." Game
Studies. 3.2 (2004). 31 March 2004.
Thomas, David. "Video Game Vocabulary." [Weblog entry.] Buzzcut. 20 November
2003. 29 March 2004.
Tong, Wee Liang and Marcus Cheng Chye Tan. "Vision and Virtuality: The Construction
of Narrative Space in Film and Computer Games." Screenplay:
cinema/videogame/interface. Eds. Geoff King and Tanya Krzywinska. London:
Wallflower Press, 2002.
Toy Story. Dir. John Lasseter. Disney/Pixar, 1995.
Ubisoft, Dargaud. XIII. (PC) Morrisville, NC: Ubisoft, 2003.
Valve Software. Half-Life. (PC) Bellevue, WA: Sierra, 1998.
Wadhams, Nick. "Of ludology and narratology." Associated Press. 14 February 2004.
Ward, Paul. "Videogames as Remediated Animation." Screenplay:
cinema/videogame/interface. Eds. Geoff King and Tanya Krzywinska. London:
Wallflower Press, 2002.
What Price Fleadom? Dir. Tex Avery. Writ. Heck Allen. Warner Brothers, 1948.
Wolf, Mark J.P. "Genre and the Video Game." The Medium of the Video Game. Ed. Mark
J.P. Wolf. Austin: University of Texas Press, 2002.
Zach Whalen was born in South Carolina and grew up in East Tennessee. He
attended Carson-Newman College in Jefferson City, TN, where he was awarded the
Presidential Honor Scholarship. He also did well in Cross-Country, earning 2nd Team
All-SAC honors, and continues to run marathons. He completed his undergraduate
honors thesis, titled "Theoryspace v2.03: Applications of Critical Theory in Hypertext
Literature." After receiving his M.A. degree, Zach is continuing in the Ph.D. program in
the Department of English at the University of Florida.