<!doctype html>
<html lang="en">
  <!-- hi stranger -->
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover">
    <link href="favicon.png" rel="icon">
    <link href="styles.css" rel="stylesheet">
    <title>Creating Music—Part 3</title>
    <meta name="description" content="Senior Capstone Journal Entry 3">
    <script id="MathJax-script" async src="mathjax.js"></script>
  </head>
  <body>
    <header>
      <h1><a href="..">&larr;</a>Creating Music—Part 3</h1>
      <time datetime="2020-06-30">30 June 2020</time>
      <a href="track1.html">Part 1</a>
      <a href="track2.html">Part 2</a>
    </header>
    <main>
      <section>
        <h2>Getting a Deeper Understanding</h2>
        <p>
          Now that I’ve had my fun playing around with high-level
          audio environments (namely Sonic Pi and Schismtracker), I
          want to actually learn how those sounds are made. This
          brings us to the topic of <em>audio synthesis</em>. There is
          <a href="https://www.soundonsound.com/techniques/whats-sound">an
            incredible series of articles</a> by the superhuman Gordon Reid
          titled “Synth Secrets,” published in
          <a href="https://www.soundonsound.com">Sound on Sound</a>
          between 1999 and 2004. Though it focuses on analog synths, it
          really helped me lay the foundation for my capstone
          project: music through code.
        </p>
      </section>
      <section>
        <h2>What is Sound?</h2>
        <p>
          <em>
            Note: I have never taken a formal music theory class, so take
            this section with a grain of salt.
          </em>
        </p>
        <p>
          Any sound can be described by three properties: amplitude (how
          loud it is, measured in decibels, dB), frequency (the pitch,
          measured in hertz, Hz) and timbre. The last is the most
          interesting. Sounds that are “pure,” not changing over time,
          are realllly booooring. A plain sine wave (∿) is an example
          of a pure sound, and it is not at all interesting to listen
          to.
        </p>
        <p>
          Timbre is another way to say the “feel” of a sound. Sounds
          are composed of a main frequency (usually the lowest in
          pitch) and multiple other frequencies at higher pitches (the
          overtones). These overtones create the timbre. If the
          overtones are whole number multiples (2×, 3×, etc…) then it
          sounds “good.” These whole number multiples are
          called <strong>harmonics</strong>.
        </p>
        <p>
          Sounds have two different, yet interchangeable,
          representations.<sup><a id="fnret:1" href="#fn:1">[1]</a></sup>
          The first is a <strong>waveform</strong> or function. This
          can be any continuous line, curve (or otherwise) where the
          \(x\) value is time and the \(y\) value is voltage going to
          the speaker.<sup><a id="fnret:2" href="#fn:2">[2]</a></sup>
          This function can be as complicated as required to produce
          a desired sound.
        </p>
        <p>
          The second representation is a <strong>series of plain sine
          waves</strong>. They are usually visualized as a bar graph
          with \(x\) being frequency and \(y\) being amplitude.
          Converting a complex waveform to this collection of
          amplitudes and frequencies uses the <strong>Fourier
          transform</strong>, named after French mathematician Joseph
          Fourier:
          \[\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\ e^{-2\pi i x \xi}\,dx\]
        </p>
        <img style="display:flex;margin:auto" src="ft.gif">
        <p>
          The main takeaway is that the red waveform and the blue
          series of sine waves are <strong>equivalent</strong> ways to
          describe the exact same thing.
        </p>
      </section>
      <section>
        <h2>From Math → <span class="mono">0b01010101</span></h2>
        <p>
          This math stuff is nice to look at and all, but how can I
          make a sick beat on my computer? For this, I turned to
          <a href="https://www.openbsd.org">OpenBSD</a>’s
          <a href="http://www.sndio.org/">Sndio</a>
          library and server to create the sounds. The
          handy <a href="http://man.openbsd.org/sio_open">manual
          page</a> was incredibly useful as well as Boulanger and
          Lazzarini’s
          <i>The Audio Programming Book</i> (MIT Press, 2011).
        </p>
        <p>
          After creating a handle (an opaque pointer type) with
          <code>sio_open()</code>, parameters for the sound can be set
          using <code>sio_initpar()</code>
          and <code>sio_setpar()</code>. For my test project, I set
          the parameters to 16 bits per sample, two channels (stereo
          audio) and the sample rate to 44100 Hz. Woah woah woah! Slow
          down! What the heck does any of that mean?!? First off, this
          journal entry is <em>not</em> a
          <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>
          tutorial, but it will go in depth on digital audio.
        </p>
        <p>
          So let’s, shall we? First are <strong>samples</strong>. A
          sample is a really small sliver of digital audio. Samples
          are needed in the first place (instead of just functions)
          because computers cannot deal with analog values or fancy
          math things like infinite integrals themselves. So instead
          we estimate, creating discrete samples of audio. Also, we
          may not know what sound we want to make in the future (say,
          if someone presses a key on a keyboard), so samples allow us
          to create audio on demand. Later on, the actual format of
          samples is discussed.
        </p>
        <p>
          When I set the <strong>sample rate</strong> to 44100 Hz,
          that means there are that many samples played <em>each
          second</em> on each channel. Since I asked for 2 channels of
          audio, twice that many are played in total.
          Digital audio samples are <strong>interleaved</strong>,
          meaning if there are two channels L and R, the samples are
          arranged like LRLRLRLRLR…. One cycle of these samples (in
          our case just LR) is called a <strong>frame</strong>.
        </p>
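        <p>
          To make the frame and interleaving bookkeeping concrete,
          here is a small sketch. Both function names are hypothetical
          helpers written for this entry, not part of sndio.
        </p>

```c
#include <stddef.h>

/* Number of samples in a buffer holding nframes frames of nchan channels. */
size_t
samples_in(size_t nframes, size_t nchan)
{
        return nframes * nchan;
}

/*
 * Position of one sample in an interleaved buffer: frames are stored
 * one after another, and inside a frame the channels are in order
 * (chan 0 is L and chan 1 is R for stereo).
 */
size_t
interleaved_index(size_t frame, size_t chan, size_t nchan)
{
        return frame * nchan + chan;
}
```

        <p>
          With these, one second of 16-bit stereo audio at 44100 Hz
          comes to <code>samples_in(44100, 2)</code>, that is 88200
          samples or 176400 bytes, and the right-channel sample of
          frame 3 sits at index 7.
        </p>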
      </section>
      <section>
        <h2>Let’s Create a Sine Wave</h2>
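        <code>#include &lt;err.h&gt;
#include &lt;sndio.h&gt;
static const unsigned int SAMPLE_RATE = 44100;
int
main(int argc, char *argv[])
{
        struct sio_hdl *hdl;
        struct sio_par par;
        /* open the default device for playback, in blocking mode */
        if ((hdl = sio_open(SIO_DEVANY, SIO_PLAY, 0)) == NULL)
                errx(1, "sio_open failed");
        sio_initpar(&amp;par);
        par.bits = 16;          /* bits per sample */
        par.pchan = 2;          /* playback channels (stereo) */
        par.sig = 1;            /* signed samples */
        par.rate = SAMPLE_RATE; /* frames per second */
        if (!sio_setpar(hdl, &amp;par))
                errx(1, "sio_setpar failed");
        if (!sio_start(hdl))
                errx(1, "sio_start failed");
        /* Play your samples here */
        sio_close(hdl);
        return 0;
}</code>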
        <p>
          This is just the initialization routine: we get
          our <code>hdl</code>, set up our parameters (16 bits, 2
          channels, yes to signed samples and the 44100 Hz sample
          rate), tell sndio that we are going to start playing
          with <code>sio_start()</code>, generate and play samples
          (see next) and finally close up our handle.
        </p>
        <p>
          Generating samples is more interesting:
        </p>
        <code>#include &lt;math.h&gt;
void
play_sine(struct sio_hdl *hdl, double seconds)
{
        double samp;
        double freq = 440;
        double tau = 2.0 * M_PI;
        /* two samples (left and right) per frame */
        unsigned int nsamples = 2 * (unsigned int)(seconds * SAMPLE_RATE);
        short samples[nsamples];
        unsigned int i;
        for (i = 0; i &lt; nsamples; i += 2) {
                /* i / 2 is the frame number, shared by both channels */
                samp = sin(tau * freq * (i / 2) / SAMPLE_RATE);
                samples[i] = samp * 32767.0;
                samples[i + 1] = samp * 32767.0;
        }
        sio_write(hdl, samples, sizeof(samples));
}</code>
        <p>
          Sixteen-bit samples seemed to be pretty common in my
          research, and they map to C’s <code>short</code> type. Sndio
          gives us programmers a really nice API to create sound with
          <code>sio_write()</code>. After we allocate our buffer
          (called <code>samples</code> above) and fill it with
          samples, we can just pass a pointer to it (along with the
          number of bytes) and, once enough audio is in sndio’s
          internal buffer, we get sound!
        </p>
        <p>
          We increment by 2 so that each pass through the loop fills
          one full frame: a left sample and a right sample. That is
          also why the buffer holds twice as many samples as there
          are frames, and why the time is computed from the frame
          number <code>i / 2</code> rather than from <code>i</code>.
          Using <code>i</code> directly would step through the wave
          twice as fast and play the note an octave too high.
        </p>
        <p>
          The goodies are in the first statement of the loop, so let’s
          break it down:
          <code>samp = sin(tau * freq * (i / 2) / SAMPLE_RATE);</code>.
          We can create a sine wave with <code>math.h</code>’s
          <code>sin</code> function. The argument is expected to be in
          radians, not degrees (a throwback to trigonometry class!). To
          create the angle, we start with how many seconds into the
          sound we are, the frame number <code>i / 2</code> divided by
          <code>SAMPLE_RATE</code>.<sup><a id="fnret:3"
          href="#fn:3">[3]</a></sup> Multiplying that by the
          frequency <code>freq</code> in hertz (in this case 440 Hz)
          gives the number of full turns, and multiplying by \(2\pi\)
          or \(\tau\) (<code>tau</code>) converts turns to radians.
        </p>
        <p>
          Okay great, but there is a problem. <code>samples</code> is
          an array of <code>short</code>, but our <code>samp</code> is
          a <code>double</code>! What transform do we need to do? It’s
          actually pretty simple: <code>sin</code> returns a value between
          \(-1\) and \(1\), so just multiply by the maximum
          <code>short</code>, which is \(32767\). Then all we
          have to do is assign it to both the left
          <code>samples[i]</code> and the right
          <code>samples[i + 1]</code> channels! Once we have our
          buffer, write it to the soundcard or audio
          server<sup><a id="fnret:4" href="#fn:4">[4]</a></sup> and we
          hear a sweet, sweet sine wave.
        </p>
      </section>
      <section>
        <h2>Wrapping Up</h2>
        <p>
          In this journal entry we learned a little about sound and how
          it can be represented mathematically. After that, there was
          an introduction to digital audio, including samples, sample
          rates and multi-channel audio. Finally, a little bit of C was
          used, along with OpenBSD’s sndio, to play a simple sine
          wave, created only using code! Next entry I want to focus on
          creating more complex and interesting sounds. This is where
          the Fourier transform will be put to use, allowing us to
          surgically filter and create something more interesting. Or
          maybe I’ll end up randomly trying different things out and
          choosing what sounds the coolest. See ya later!
        </p>
        <p>Discuss on <a href="https://lobste.rs/s/ngon45/creating_music_part_3">Lobsters</a>.</p>
      </section>
      <section>
        <h2>Footnotes</h2>
        <ul>
          <li id="fn:1"><a href="#fnret:1">[1]</a> Though you <em>can</em> describe it in plain English too.</li>
          <li id="fn:2"><a href="#fnret:2">[2]</a> It gets a little more complex with digital audio and a soundcard, though.</li>
          <li id="fn:3"><a href="#fnret:3">[3]</a> It is okay that this sub-expression becomes greater than 1 because the sine function is cyclical.</li>
          <li id="fn:4"><a href="#fnret:4">[4]</a> This is the nice part that is abstracted away with sndio.</li>
        </ul>
      </section>
    </main>
    <footer>
      <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" src="images/cc.png"></a>
    </footer>
  </body>
</html>