
Music visualization notes

Sort of like a blog. Topics: music, music visualization.


Visual counterpoint vs. musical counterpoint.

People often talk about "counterpoint" in visual arts as being analogous to musical counterpoint. That would be fine, except that the meaning of "counterpoint" is usually limited to the idea of having more than one thing happen at a time. In music, two additional constraints are applied to counterpoint. The first is that the two (or more) things that are happening at the same time are designed so that they maintain both their distinctness (they don't fuse into a single thing) and their identity (they don't come apart into sub-things). This constraint is reflected in the rules of polyphonic voice-leading. The second constraint is not so often mentioned, but it is equally important: there has to be a way for you to direct your attention through the counterpoint such that you can (at least in theory) keep track of what's going on. Part of this burden is on the listener, but part is on the composer. The secret is: figure out what the most salient features (or transitions) of each melodic line are, and arrange things so that they happen in different parts at different times. Another way of saying this is: arrange things so that there is a possible path of the listener's attention such that all the salient features can be noticed, one after another.


Sight-reading pre-exercises. A couple of weeks ago, I got a little clearer about something that people who can sight-read music fluently have to be able to do: take in a bunch of music at a glance and hold it in mind while playing it and looking at some other music. This suggests some exercises. (1) Look at a measure of a score until you have it firmly in mind, then close your eyes and play it from memory; repeat. (2) Look at a measure of a score until you have it firmly in mind, then switch your attention to a different measure (ideally, on a different page, so you're not tempted to look back), and start playing the first measure over and over; look at the new measure until you're ready to play it, then switch; repeat. (3) While repeating a measure of polyphonic music, switch your attention from one voice to another (singing along is a good way to make sure that you are switching your attention). (4) Same as 1 and 2, but with multiple measures.


I went to Boston to attend the Visual Music Marathon; I took some notes; here they are (in boldface, translated into something closer to English), followed by what I had in mind when I wrote them.

  • How to live with the frame rate? With live action performed in person, there's no frame rate, but as soon as you involve film, video, or computers, there is. I was struck by how different animators dealt with this. Some pieces, like hand-drawn animation with one drawing per frame, were very much tied to the frame rate. Some used techniques to hide the frame rate (like motion blur in Lytle's piece). Many had frame-related visual artifacts, ranging from the obviously intentional (playing with the fact that there was a frame rate, capitalizing on the beating between some rate in the animation and the frame rate) to obviously unintentional (things which would have looked better with motion blur).

  • How stable is a sound? How stable is an image? When you encounter a complex visual scene, your eyes move around it, explore it; the image itself can be completely static (for example, a painting), and yet the experience of it is anything but static. A musical sound, by contrast, has much more of an "at once" quality to it: if you play a complicated chord on the organ and just hold it, your mental state may change as you listen, but it's more like it's changing in response to an aural gestalt; there's much less sense of "exploring" the sound. What does this mean for somebody who's trying to visualize music? Can a static image be an effective representation of non-static music? I think the answer is sometimes yes, at least in part. For example, if there are a lot of staccato notes happening in a chaotic way, a chaotic distribution of small objects might feel like a good match.

  • Abstract music vs. abstract form. I'm no longer sure what I meant when I wrote that. A lot of the pieces had soundtracks that were very far from what I consider "mainstream" music; some would not even be called "music" by most people. I found myself grateful when a soundtrack was something I might have wanted to listen to on its own (which was the exception rather than the rule). I know that my musical tastes are narrow; was my reaction to the soundtracks a reflection of that? The abstract forms of the images didn't affect me the same way.

  • Show beating spatially. What happens when the beat rate exceeds the frame rate? This is an idea about showing the difference between consonance and dissonance. Dissonance is (at least in part) a temporal phenomenon: when harmonics of two pitched sounds differ in frequency by a certain amount, there's an oscillation in their combination known as beating. This beating is at the borderline of being too fast to be shown directly (especially when the frame rate is low). A slow beating (at the speed of, say, vibrato) can be depicted directly, but as the beating rate increases to the point where we consider it "dissonance," it gets close enough to the frame rate that in addition to seeing the beating of the sound, you're seeing the beating between the beating frequency and the frame rate. This could be avoided if the beating could be shown spatially: the oscillations could be depicted as ripples, undulations, etc.
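
    As a back-of-the-envelope check on this (a minimal sketch; the pitch values, harmonic count, and 30 fps frame rate are my assumptions):

```python
# Sketch: smallest beat rate among the harmonics of two pitched tones,
# compared to a nominal frame rate. All constants are illustrative.

FRAME_RATE = 30.0  # frames per second (assumed)

def beat_rate(f1, f2, n_harmonics=8):
    """Smallest nonzero frequency difference between harmonics of f1 and f2
    (assumes the two tones are not identical)."""
    h1 = [f1 * k for k in range(1, n_harmonics + 1)]
    h2 = [f2 * k for k in range(1, n_harmonics + 1)]
    return min(abs(a - b) for a in h1 for b in h2 if abs(a - b) > 1e-9)

c4, d4 = 261.63, 293.66           # a major second: a fairly dissonant interval
print(beat_rate(c4, d4))          # ~32 Hz: faster than 30 fps, so it would strobe
print(beat_rate(c4, c4 * 1.001))  # ~0.26 Hz: slow enough to depict frame-by-frame
```

    The point of the note shows up in the numbers: the beating of a dissonant interval sits right around typical frame rates, which is why a spatial depiction (ripples, undulations) sidesteps the strobing.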

  • Isolated event vs. trend (continuum). What in music does music visualization depict? The aural objects (notes, sounds)? The events? The trends? The effects these have on the listener? How do we draw the line between these? These are questions I need to think about more (and which I don't expect to ever stop thinking about).

  • Beating <--> Phase, Angles. One way of thinking about beating is to consider the phase difference of the two harmonics; when the beating is slow, the phase changes are slow; if you were to depict this on a circle, there would be slow motion around it; if it were faster, there would be faster motion. There's still the question of the frame rate (and strobing between the frame rate and the circular motion); if the circular motion left a path (a helix?), it would be very wiggly when there was dissonance.
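
    A minimal sketch of the circle idea (the function name, the 30 fps rate, and the example frequencies are mine): track the phase difference of two sinusoids and sample its angle once per frame.

```python
import math

def phase_angles(f1, f2, frame_rate=30.0, n_frames=60):
    """Angle (radians, mod 2*pi) of the accumulated phase difference
    between sinusoids at f1 and f2 Hz, sampled once per frame."""
    beat = f1 - f2  # the beating frequency is the frequency difference
    return [(2 * math.pi * beat * (i / frame_rate)) % (2 * math.pi)
            for i in range(n_frames)]

# Slow beating: the point crawls around the circle, a small step per frame.
slow = phase_angles(440.0, 439.5)                # 0.5 Hz beating
step_slow = (slow[1] - slow[0]) % (2 * math.pi)  # ~0.10 rad/frame
# Beating near the frame rate: huge per-frame jumps --- strobing.
fast = phase_angles(440.0, 415.0)                # 25 Hz beating
step_fast = (fast[1] - fast[0]) % (2 * math.pi)  # ~5.24 rad/frame
```

    A path traced through these samples would be smooth for consonance and wildly wiggly for dissonance.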

  • "White" noise, "white" image; what is it? The "snow" on a TV that's not tuned to any station (probably a thing of the past once we go to digital TV) seems very similar to the sound of white noise. What can we learn from this?

  • How to combine live images and artificial? More often than not I find other people's solutions to this rub me the wrong way. Why is that? I think it's my personality; I like to get into one thing at a time; I love nuts, but I resent it when there are nuts in fruitcake.

  • Low-rez; a high-resolution look. Many of the pieces in the show played around with the idea of low-resolution (most memorably I've got a guy running). There's an interesting dichotomy in the idea of "low-resolution": you only know that something is low-resolution by having a higher-resolution way to judge it. This could be an interesting thing to explore in a piece.

  • MIDI + audio analysis tools; profile for each note. If you have a MIDI version of a recording, you can do note-specific audio analysis of the recording and get information that's nearly as good as you'd get if you had recorded each instrument separately. Since a visualization tool based on MIDI could pre-play the score and do this analysis, it could be based much more closely on the audio.
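
    A toy sketch of the per-note analysis (the sample rate, function names, and the single-bin-DFT approach are my assumptions; a real tool would use a proper FFT library):

```python
import math

SAMPLE_RATE = 8000  # Hz (assumed, for the sketch)

def note_energy(samples, onset_s, dur_s, freq_hz, sr=SAMPLE_RATE):
    """Energy of the recording near one note's fundamental: a single-bin
    DFT over the note's time span (a crude per-note 'profile')."""
    start = int(onset_s * sr)
    n = int(dur_s * sr)
    segment = samples[start:start + n]
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sr) for i, x in enumerate(segment))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sr) for i, x in enumerate(segment))
    return (re * re + im * im) / max(len(segment), 1)

# Synthetic "recording": a 440 Hz note from 0.0-0.5 s, then silence.
audio = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) if i < SAMPLE_RATE // 2 else 0.0
         for i in range(SAMPLE_RATE)]
# The MIDI score supplies the onset, duration, and pitch to aim at.
loud = note_energy(audio, 0.0, 0.5, 440.0)
quiet = note_energy(audio, 0.5, 0.5, 440.0)
```

    Because the score tells you exactly where and at what pitch each note is, the analysis can be aimed at one note at a time, which is what makes the result nearly as clean as a solo recording of each instrument.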

  • All the notes build a picture. I'm not sure, but I think the idea here was that as a piece unfolds, it builds something --- and that a representation of this wouldn't show merely the "now" but everything that had happened. Fischinger's Motion Painting #1 suggests that, but its distant past is completely obliterated.

  • Image --> Description --> Image (with "<--manipulate" pointing at "Description"). Extract regions. One way of looking at music visualization is as finding a description of the music which is general/abstract/formal enough that it could be considered the description of an image (and rendered that way). However, there's always the question: what kind of description results in images we want? One way to find this out is to do the reverse: convert images into a formal description, and compare these descriptions with music descriptions. Also, if you have a way to convert between description and image, you could manipulate the description as a way to manipulate the image.

  • Titles-only (or as a visualization). Norman McLaren's Begone Dull Care/Caprice en couleurs had such delightful title sequences that I saw it would be possible to do a piece that was all titles. This could be a music visualization piece, too.

  • How easy it is to do something so much better. With many of the pieces in the show, I had the thought "if I could do what that artist did, I could make something a lot better." Does everybody feel that way?

  • Make a "score" of a visualization, to be performed either totally automatically or interactively. This, of course, implies some kind of notation ... I'm working on it.

  • Long-trend spectral average differences (max/min). This is a method for finding important structural boundaries in a piece of music.
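
    The note is terse, so this is my guess at the method: average the spectrum over a long window before and after each point, and treat large differences between the two averages as candidate boundaries.

```python
def boundary_strength(frames, span=10):
    """For each frame index, the distance between the average spectrum
    over the preceding `span` frames and over the following `span` frames.
    Peaks suggest structural boundaries. (My guess at the note's method.)"""
    def avg(chunk):
        n_bins = len(chunk[0])
        return [sum(f[b] for f in chunk) / len(chunk) for b in range(n_bins)]
    out = []
    for i in range(span, len(frames) - span):
        past, future = avg(frames[i - span:i]), avg(frames[i:i + span])
        out.append(sum(abs(p - q) for p, q in zip(past, future)))
    return out

# Toy "spectra": 30 bass-heavy frames followed by 30 treble-heavy frames.
frames = [[1.0, 0.0]] * 30 + [[0.0, 1.0]] * 30
strength = boundary_strength(frames)
peak_index = strength.index(max(strength)) + 10  # offset by span
# The largest difference lands at the section change (frame 30).
```

    Short windows find note onsets; the long windows here smooth those out, so what's left are the slower, section-scale changes --- the max/min in the note would be the strongest and weakest such boundaries.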

  • Catalog of types of distinctions of techniques. Kind of a meta-catalog. First, you make a catalog of pieces. Then, you make a catalog of techniques used in those pieces. To do that, you first have to identify the techniques, then find the relationships between the techniques: which ones are similar, complementary, etc. In doing that, you find that you use lots of different criteria for deciding whether two techniques are similar (and they may be similar in terms of one criterion and different in terms of another). This leads to a catalog of types of distinction. That's the catalog I want.

  • The edge. Titles around the edge. With film and television, the border of the image area is a kind of no-man's land. You have to put something there (even if that something is just ... nothing), but you can't put anything really important there, because there's the chance that it will be cut off. As we move toward a completely digital image transmission medium, the situation is changing: I have a digital computer monitor on which all pixels are guaranteed to be there. This means that I can put something right at the edge of an image, and be sure that it will be seen by my audience. I'm not aware of anyone taking conscious, explicit advantage of this. For example, you could have titles all around the edge of the screen. Or, you could start with an image that was typically inset from the edge, but then violate the edge ... leading to the question: what is the edge?

  • Web of music visualization knowledge. What do we know? Where is it? What's the way to catalog this? Should there be a music visualization Wiki, maybe?

  • Marathon: priority ordering. After the marathon, I heard that it had not been reviewed, and that part of the reason for this was that no reviewer could be expected to sit through all twelve hours. My thought was related: when every hour of the marathon is equivalent to every other hour, a casual attendee (who stays for only an hour or so) sees only an average selection of works. If the auditorium were filled to capacity every hour, this might be okay, but it was (if I remember right) less than half full at most. Promoting a small number of hours (say, 7 to 10 pm) as the "cream" of the show would have several advantages. A person who was undecided about whether to attend (knowing that 50% of anything is below average) would be more likely to. People who attended during prime time would be more likely to attend next year. The event would be more likely to be reviewed. Etc. Of course, this makes the show less egalitarian, and there might be fewer hard-core attendees at the lower-ranked hours, but I think the benefits outweigh the downside.

  • Use scale space to compare images. Several years ago I talked to Larry Cuba about building a tool that could translate film/video into sound. Now that many of Fischinger's films are on DVD, and I've learned a lot more about psychoacoustics and expanded my toolkit, this project, which seemed like a lot of work when I first talked to Larry about it, now seems straightforward and tractable. What I realized during the VMM was that scale space filtering would find the motion of interest, and that a mapping between size and pitch/bandwidth (of filtered white noise) would be a good thing to try first.
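
    As a first stab at the size-to-sound mapping (the log mapping, the frequency range, and the bandwidth rule are all my assumptions, not anything Larry and I settled on):

```python
import math

def size_to_noise_band(size_px, img_height=480, f_low=100.0, f_high=8000.0):
    """Map a feature's size (pixels) to a band of filtered white noise:
    big shapes -> low center frequency, small shapes -> high. The log
    mapping and the constants are assumed, not an established rule."""
    size_px = max(1, min(size_px, img_height))
    t = 1.0 - math.log(size_px) / math.log(img_height)  # 1 = tiny, 0 = full-frame
    center = f_low * (f_high / f_low) ** t
    bandwidth = center / 3.0  # bandwidth proportional to center (constant-Q, assumed)
    return center, bandwidth

big_center, _ = size_to_noise_band(480)  # full-frame blob -> bottom of the range
small_center, _ = size_to_noise_band(2)  # tiny speck -> near the top
```

    The returned band would then parameterize a bandpass filter applied to white noise: big, slow shapes hiss low and narrow, small details hiss high.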

  • A composite can change direction by having sub-parts slow and change out of sync. I don't remember what this was a response to. The idea, though, is that a shape composed from many smaller elements can change direction in a way that is less definite than the motion of its parts. An example would be a flock of birds: all birds could change direction at the same time, or all at different times; either way, the flock would end up going in a different direction, but in the latter case, the sense of "right now it changed directions" wouldn't be there. This is something to remember to play with ...

  • Re-factor image and controls, re-map. This is an idea for a kind of image transition effect. If you have multiple methods for turning an image into a description, then you can analyze an image generated from one kind of description using an unrelated method, and make a transition by switching to the new description.

  • Chair-dance: office stop motion. I've seen a few pieces on this theme, but I think there's still a lot of unexplored potential.

  • Spiderweb can change its spacing. A spiderweb is one of those fundamental forms, but I don't think I've seen animation which manipulated it the way rectilinear forms have been. What I'm thinking of is a Mondrian-esque animation but on a spiderweb.

  • An object tracks the voice when it is present. The human voice, especially a single voice, when present in music, jumps out of the texture like no other instrument. I haven't seen music visualization which adequately reflects that.

  • Bowing can go two or more directions. The change of direction of a bow is something that feels very natural --- the sense of "we've gone as far as we can/want/need in this direction, so we slow to a stop and go in a different direction" --- but I've never seen that applied to more than two (opposing) directions.