Songsmith: Music Creation for the Masses

By Rob Knies, Managing Editor, Microsoft Research

When you look in the mirror, do you see Leona Lewis? Do your shower walls reverberate each morning with “Womanizer”? Are you convinced that you just might be the next Taylor Swift?

Let’s face it: We all fantasize occasionally that we could sing our way into the spotlight. Guys, too; they’re just channeling Justin or Bono or T.I. instead. It’s part of human nature.

For most of us, though, it’s a part we keep tightly concealed. After all, it’s even more likely that we’re not the new Usher. But now, thanks to a couple of music-minded researchers from Microsoft Research, that’s all about to change.

You, too, can become a Songsmith.

Songsmith™ is a new product from Microsoft Research, courtesy of Sumit Basu and Dan Morris, that can take your surreptitious singing and add a tailored instrumental accompaniment, thereby giving even musical novices a chance to experience the joys of sonic creativity.

That the product also proves useful for more accomplished musicians is simply a grace note.

“We take your singing and from that generate the chords that go along with it,” explains Basu, 34, a researcher in the Knowledge Tools group at Microsoft Research Redmond. “Nobody else does that in the way that we do.”

Of course, it’s not quite that simple.

Songsmith enables Sumit Basu (left) and Dan Morris to pursue their musical visions.

“Songsmith allows somebody who has limited or no musical experience to sing into a microphone and get back a piece of music right away,” says Morris, 30, a researcher in the Computational User Experiences group, also based in Redmond. “Backing music has been generated to accompany your voice. It’s not going to sound like you spent a million dollars in a fancy Los Angeles studio, but it’s going to sound musical.

“The goal is to give somebody who doesn’t know anything about chords or music theory a way to make something that is authentically musical, more than good enough to make a cute birthday card for Mom or a Valentine’s Day love song. That’s the core functionality of the application, to give you a taste of songwriting.”

And, aspiring vocalists and songwriters, a free trial download is available, good for six hours of experimentation. If that convinces you of its value, as Morris and Basu feel certain it will, a full-fledged version can be purchased for $29.95 in the United States or €29 in the European Union.

Both researchers are accomplished musicians. Morris plays in an ’80s cover band called Rewind, and Basu has a portfolio of self-penned tunes online.

“Neither of us has extensive formal training in music theory,” Morris says. “We both like to sing, we both like to play guitar, we’ve both played in bands, we both like to write songs. It’s fun for us. That little bit of fun that we have when we make new music, we want to give a taste of that to everyone.”

In a way, Songsmith enables users to shrug off years of indoctrination about what it means to be “musical.”

“I remember my sister and I being in the car,” Basu says. “I would be singing something, and my parents would say, ‘What song is that?’ My sister would say, ‘Sumit is just making it up,’ and they’d say, ‘Well, you should sing a song that actually exists.’ Before we learn that we’re not supposed to create music or we’re not able to do that, it’s something that exists in our hearts.

“It’s not very pleasant to listen to someone sing in isolation, though. Having a backing track that fits with what they’re trying to sing makes for a better overall experience. That is going to be our biggest contribution, that empowerment to let people go from having a musical idea or thought or melody to something they’d actually want to show other people.”

But is it complicated?

“The first thing you see when you load Songsmith is a glowing Record button,” Basu says. “The first thing we want the user to do is hit that Record button and start singing. To help you along, there’s a drum track that’s playing, and you sing along. As soon as you’re done singing, you hit Play, and it plays back.

“At this point, you can start manipulating the song without having any musical knowledge. There are two principal sliders, which let you adjust the ‘happy factor,’ which makes the song happier or sadder, and the ‘jazz factor,’ which makes the chords more adventurous or more conservative. You can also change the style and the sound of how those chords are being played.”

Songsmith will ship with 30 unique musical styles, enabling you to give your song a reggae background or a hip-hop feel, to alter tempos, and to change the prominence of vocals and instruments in the mix.

“Once you’ve got your chords worked out and you like the backing,” Basu says, “you can hit Record again and record yourself singing over this accompaniment you like, and you get this really nice final take. The great thing is that you can save the file and send it to your friends if you want. You can export it directly to a Windows Media Audio file that you can put on your Web page, on MySpace, on Facebook. And you can take the song you just generated and make it into a Windows Movie Maker video where the soundtrack is the song you just made. Lip-sync along with your song and you have a video.”

Songsmith mimics the creative process professional musicians use.

“You can not only get that piece of music,” Morris says, “you can manipulate it in a way that’s not totally dissimilar from what a songwriter would do next. A professional musician might be able to explore potential chord progressions quickly. To a non-musician, that’s just total black magic. How would you come up with one progression, much less many?

“We give you a bunch of tools in Songsmith that don’t require you to know about chords and music theory to explore the space of different possible chord progressions. It’s not just a one-off, 10-second interaction. You can really work on a song, even if you have no experience in music theory.”

Such deep functionality makes Songsmith valuable even to more accomplished musicians. It won’t replace sophisticated pro music software, but the product meshes nicely with the songwriting process.

“What do songwriters do?” Basu inquires rhetorically. “You’re walking down the street, you’re doing your laundry, you come up with this amazing song idea, and you reach for the nearest Record button. Making music takes a long time, and you fall into ruts where you use the same chords over and over again. Now, when I record something using Songsmith, even if I don’t love the first version, I can keep it and look at it later.”

Songsmith features a simple, user-friendly interface that gives novice musicians powerful ways to shape their sound.

Creating music applications is not exactly the researchers’ prime professional focus, but in this case, their personal interests simply happened to dovetail with a research question worthy of pursuit. In fact, in October 2006, on Morris’ second day after joining Microsoft Research, he had a fateful discussion with Basu.

“There is a quick and clear transformation from melody to chords that go with it,” Morris says. “If we could capture that in an algorithm, we would really have a powerful tool. That was an unsolved problem when we started the project.

“There is lots of research in pitch tracking. There are lots of products that will play music in certain styles. But there was this great thing missing in the middle. If we could take pitches and produce chords to go with them, there’s really a nice complete pipeline, from voice to pitch to chords to accompaniment. If we could solve that piece of the problem, we could enable a whole, powerful system.”
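The missing middle step Morris describes, going from pitches to chords, can be illustrated with a toy hidden Markov model, the family of models the article names later. What follows is only a minimal sketch under stated assumptions, not Songsmith’s actual algorithm: the four-chord vocabulary, the probabilities, and the function names (`emission_logp`, `transition_logp`, `predict_chords`) are all hypothetical, chosen for illustration. Each measure of melody is a list of pitch classes, and a Viterbi search picks the chord sequence that best balances fitting the sung notes against changing chords too often.

```python
import math

# Hypothetical toy vocabulary: four chords in C major, each a set of pitch
# classes (0 = C, 1 = C#, ..., 11 = B). These values are assumptions made for
# illustration; none of them come from Songsmith.
CHORDS = {
    "C": {0, 4, 7},
    "F": {5, 9, 0},
    "G": {7, 11, 2},
    "Am": {9, 0, 4},
}
NAMES = list(CHORDS)


def emission_logp(chord, notes, in_chord=0.8):
    """Log-probability of one measure's notes given a chord.

    Assumes each note is a chord tone with total probability `in_chord`,
    spread evenly over the 3 chord tones, and otherwise one of the 9
    remaining pitch classes.
    """
    p_in = in_chord / 3
    p_out = (1.0 - in_chord) / 9
    tones = CHORDS[chord]
    return sum(math.log(p_in if n % 12 in tones else p_out) for n in notes)


def transition_logp(prev, cur, stay=0.4):
    """Log-probability of moving from `prev` to `cur` (assumed, not learned)."""
    if prev == cur:
        return math.log(stay)
    return math.log((1.0 - stay) / (len(NAMES) - 1))


def predict_chords(measures):
    """Viterbi decoding: return the most likely chord name for each measure."""
    if not measures:
        return []
    # best[chord] = (log-probability of the best path ending in chord, path)
    start = math.log(1.0 / len(NAMES))
    best = {c: (start + emission_logp(c, measures[0]), [c]) for c in NAMES}
    for notes in measures[1:]:
        step = {}
        for cur in NAMES:
            score, path = max(
                ((best[p][0] + transition_logp(p, cur), best[p][1]) for p in NAMES),
                key=lambda item: item[0],
            )
            step[cur] = (score + emission_logp(cur, notes), path + [cur])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


# Four measures whose notes outline C, F, G and C.
melody = [[0, 4, 7], [5, 9, 0], [7, 11, 2], [0, 4, 7]]
print(predict_chords(melody))  # ['C', 'F', 'G', 'C']
```

In a real system the chord vocabulary and both probability tables would presumably be estimated from data rather than written by hand, and the input pitches would come from a pitch tracker running on the recorded voice.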

Adds Basu: “What was really cool is that we had both independently thought about this. We got some data and started analyzing it with an intern, Ian Simon. And there was an epiphany: ‘OK, it turns out our intuitions were in the right place. You can take a melody and predict the chords reasonably well. There’s some flexibility in which ones are right, but you can do a good job with that. What should we do next?’

“What if you could pick up a microphone and sing into it and have something come out with your voice and those chords? We all looked at each other and said, ‘OK, this is going to be a lot more work, but it will be so much cooler and it will reach so many more people.’”

Simon, a University of Washington graduate student with a background in machine learning and artificial intelligence and with an incredible ear for music, stepped to the fore. Simon would become a major contributor to the project and would continue to collaborate with Morris long after this internship, as the keyboardist in Morris’ band.

“We intended his summer 2007 project to determine if we could do the mathematical portion of the project,” Morris states. “We did not have any intention of, in one summer, completing the process of ‘you sing and it goes all the way to music.’ Ian was a rock star.”

Then work turned to perfecting the user experience.

“How do we translate this,” Basu said, “into an experience accessible to people, still useful to musicians, and with the right kind of controls? The evolution of both the research and the product has been trying to think about how to get that user experience right, how to translate those things into people’s contexts, how to evaluate how this works. Does it actually generate good chords? Are people able to use it and be successful at it? Are they producing reasonable things?”

Added Morris: “If this didn’t look good, it wouldn’t invite people to do something they were scared of. It couldn’t look like a professional music app. It had to look friendly.”

To reach this goal, Morris and Basu worked with Microsoft Research’s Advanced Development Team (ADT) to take Songsmith from a research prototype to a full-fledged product, with user-friendliness as a primary goal. Richard Hughes, a principal development lead for ADT, served as primary developer for the project. Jim St. George, lead user-experience designer for ADT, served as the project’s designer, adding a hand-drawn look that invites novices to play with the application. ThuVan Pham, software-design engineer in test for ADT, served as lead tester, validating the quality of the released program.

Tim Aidlin, a user-experience designer for Microsoft’s Developer and Platform Evangelism group, also volunteered some of his time, and the product benefited.

“Tim made the whole workflow look really nice and beautiful,” Basu says, “the design simplified, good but very functional. It was very clear as to how to use it.”

So slick is the user experience that it conceals the serious computer science that makes Songsmith sing: machine-learning algorithms, model constraints and parameters, Hidden Markov models …

“You need a Ph.D. in computer science to understand that,” Morris says, “but you don’t need a Ph.D. in computer science to use that.”

The proof came during user studies.

“The last or second-to-last study we did involved people we recruited who felt like they could sing,” Basu recalls. “We had them sing a few songs, and Songsmith produced accompaniments. We had asked for people who didn’t know how to write chords or music notation. The looks on their faces when they sang and played with some sliders! They said, ‘So how do I get this at home?’ and ‘Can you send me the mp3s of the songs I made?’ It became clear to us that people were going to be really excited about this.”

Though Microsoft Research developed the core of Songsmith and the user experience, building Songsmith also depended on partnerships to provide a rich library of sounds and a diverse set of musical styles. Behrooz Chitsaz, director of IP Strategy, who seeks new product opportunities for Microsoft Research, helped cement deals with music-industry heavyweights Garritan and PG Music to make sure the sound was as professional as the user experience.

Songsmith is the second consumer product from Microsoft Research, the first being AutoCollage, which, with a click of a button, enables a user to create an attractive collage from a collection of photos.

With Songsmith in the marketplace, the researchers now shift to nurturing their new creation.

“The most exciting next steps to us are doing what we can to reach out to the community of potential users,” Morris says. “We’ll be working on the forums, answering questions, encouraging people to make music with Songsmith and share it with each other.”

Basu concurs.

“We want to reach out to all the communities that would like this,” he says, “to do a bit of evangelism to musicians and to kids and music-education programs.”

Given their pride about Songsmith, expect Morris and Basu to bring a lot of energy to their community outreach.

“The coolest thing,” the former says, “is that people who have never had a way to create something originally musical before will now be able to sit down at their computers and create something that might not be award-winning but that gives them a taste of songwriting and music creation. Maybe that will be a way that we turn people on to music creation who never otherwise might have picked up an instrument or written a song.”

For such users, Songsmith could sound a joyful noise indeed.

“I use Songsmith at home for at least an hour every day,” Basu smiles. “I have never worked on something that has significant research in it, that I personally find so useful, and that still feels magical to me. Even though I know everything that’s going on in terms of the math and what Songsmith does, the fact that it’s so useful and yet does this magical thing: takes something time-consuming and annoying and makes it fun and easier … I can’t wait to have other people start playing with this.”
