Chris Healy
Chinese and German to English translation
https://translate.chjhealy.com

Dubbing Crosstalk
Mon, 21 May 2018

As a part of a project for my Desktop Publishing class, I set my sights on dubbing a portion of a 1987 crosstalk performance by Ma Sanli and Wang Fengshan. Crosstalk (相聲/相声/xiàngsheng) is a Chinese comedic tradition in which two people perform a dialogue, something in between standup and Abbott and Costello. Of course, it's not the sort of thing you'd generally expect to be dubbed; it's barely suited to translation at all, and if you must translate it, subtitles are objectively the better option. But still, there's nothing like a good challenge!

Here’s the original video (my portion is from about 5:10 to 10:10):

And this is my dub:

The Project

The main tool I used to create my dub was Pro Tools. This was mostly a decision of convenience, since I already owned the software and could work on the project on my laptop wherever and whenever. Of course, it’s more than just that. Using a digital audio workstation on my own computer and not in a decidedly non-recording-studio computer lab meant that it was possible for me to record and rerecord straight into the program. This made the whole process much easier than just trying to get one perfect take on a file recorded elsewhere and then bringing it into the DAW to be edited. Because of this, and because I was making the recordings myself, I was able to focus on getting the best takes for each line, which is generally better practice. When working with audio, it’s always preferable to start with the best recording you can get, because there’s only so much you can do to make a bad recording or bad performance sound better after the fact. I did my best to adhere to this practice, although my performance lacks comedy, I didn’t try very much to sound like two people, and I was speaking quietly in small spaces to not disturb people around me. (Ma Sanli and Wang Fengshan have distinct voices and incredible tonal range, so when you don’t find it very comedic, blame my performance and not theirs. The two of them are hilarious, I promise.)

Recording

The first thing I did was give myself two scratch tracks as a guide, one for each performer. I set the window to scroll so that I could see the original waveforms coming and time each line to be at least approximately where it needed to be.

Here you can see the first scratch track I put down (in purple); the waveforms show that it's close, but not quite the same as the original. Once I had both my scratch tracks ready, I went back through, keeping what was already usable and rerecording everything else one or two lines at a time. This was made possible by selecting the exact time for the line (seen below in the start and end fields) so that I could start recording exactly when the line starts, and setting a short pre-roll so that I could still hear what comes before and be prepared.

Don’t worry, meter and tempo on the right have nothing to do with this project. You’ll notice that the important parts of the transport window all display minutes and seconds, not bars and beats.

Unfortunately during this process, I fell somewhat short of Very Best Practice, since I recorded on different days in two different places, and even after the rest of the process, this difference is still audible. Garbage in, garbage out.

Extras, Editing, and Mixing

Getting all the lines recorded right was a big process, but only the tip of the iceberg. I added the sounds of the audience’s laughter, shuffling, and coughing from recordings purchased online. For the laughter, I used markers to denote when and how people laughed in the original recording, and then found laughter in my new samples to match.

In this screenshot you're supposed to notice how I used markers (the yellow diamonds) to help me place laughter, but look how nicely those waveforms match up in the dialogue. Isn't it beautiful?

To make everything nice and smooth, I compressed and normalized the dialogue tracks independently, and then put more compression and gain on the master track to keep the whole thing up at higher levels without clipping. (A project like this requires different and less intense compression than music, however.)

Challenges

As I mentioned earlier, this particular project presented unique difficulties. The first was simply that the rhythm between the two performers is rapid and flexible, which was not easy to recreate. This is one of the things that made being able to go line by line so important, and by the same token it was unforgiving, leaving almost no margin for anything to not match up exactly with the original.

The biggest challenges in this project, however, were mostly due to the nature of the original recording. This isn’t something recorded and produced in a studio whose process is easily recreatable, where the sound of the environment on screen is just a believable illusion. This is a live recording made with what looks to be five microphones, not necessarily close to the performers. They sound the way they do precisely because they’re really there. So, I had a lot of work ahead of me to make my dub sound like it happened on that stage and in that room. In the original, there is a relatively high noise floor from how hot the mics are and a broad stereo sound, even if Wang and Ma are generally in the center. There was only so much I could do with my mono recordings, but I carefully calibrated reverb to create space and spread each speaking track more broadly. I also duplicated the background noise track, offset it, and panned them left and right to create a general sense of space for the speaking tracks to occupy.

In the end

I would like to be able to devote more time to this or a similar project one day, but just like translation, audio projects aren't things that you ever finish; you just stop working on them. I achieved what I think is a respectable result. When I replaced the original audio track (bottom track) with mine (top track) in iMovie, it was clear that I had replicated the original pretty closely, reaching similar levels overall with fewer clipped peaks where the recording goes above 0 dB.

Unicode and Han Unification
Thu, 29 Mar 2018

Around two weeks ago, I had the privilege of seeing Ken Lunde give a presentation on Ten Mincho, a new Japanese font from Adobe. One part of the presentation (about the font's ability to maintain differences between base characters and CJK Compatibility Ideographs) seemed a little beyond a few people, including myself.

So, over the past week I dove into research to understand it, and I thought it might be nice to share with everyone what I learned about a little thing called Han Unification and one of Ten Mincho’s best features.

The character 漢 in seal script
Image source: Wikipedia

First, Unicode

At the center of all this is Unicode. Known in the news as the emoji people, the Unicode Consortium sets the standard for character encoding, aiming to be as universal as possible. The goal is that you should be able to mix arbitrary combinations of languages in one document without having to switch encodings to display them.
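As a rough illustration of that goal (a Python sketch of my own, not anything from the presentation), one Unicode string can hold Latin, Chinese, and Icelandic text side by side, and one encoding handles all of it:

```python
# Scripts from several languages coexist in a single Unicode string.
mixed = "internationalization, 国际化, alþjóðavæðing"

# One encoding (UTF-8) covers every script and round-trips losslessly;
# no switching between per-language encodings is needed.
data = mixed.encode("utf-8")
assert data.decode("utf-8") == mixed
print(len(mixed), "code points in", len(data), "bytes")
```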

Unicode is huge, and currently includes 139 scripts. A good chunk of its code points are the Chinese characters common to Chinese, Japanese, Korean, and, in some contexts, Vietnamese. Unicode calls these CJK Ideographs, and while I know they’re not really all ideographs, I’m going to be using that language as well in this post.

Han Unification

These characters, the CJK Ideographs, are common to multiple countries and languages, and there are also thousands upon thousands of them. Using as few code points as possible to encode them all makes sense. After all, the Latin alphabet isn’t encoded twice for English and Icelandic just because the visually similar but unrelated p and þ must be separate.
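To put numbers on that example: p and þ each get exactly one code point, shared by every language that uses them, rather than one copy per language:

```python
# One code point each, reused across languages; no per-language copies.
print(f"U+{ord('p'):04X}")  # U+0070, Latin small letter p
print(f"U+{ord('þ'):04X}")  # U+00FE, Latin small letter thorn
assert ord('p') != ord('þ')  # visually similar, but distinct characters
```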

Unfortunately, it's not quite as simple as that. Letters like "A" are shared between Latin, Greek, and Cyrillic, but are encoded multiple times because each one is truly not quite the same. Among the CJK Ideographs, the line between stylistic variation and meaningful difference can get similarly blurry. There are clear differences, like the simplified and traditional characters used in Chinese, or shinjitai and kyūjitai (literally new and old forms) in Japanese. Then there are more subtle variations that have nothing to do with simplification, like 內 and 内, which are the same character and mean the same thing, but one is constructed with 入 and the other with 人. There's 說 and 説, both the same traditional character, with the first form being more common. There are characters like 令, whose lower component is typically drawn one way in simplified Chinese text, another way in traditional Chinese text, and yet another in Japanese text (though the Japanese appearance is not unheard of in Chinese). There are characters with the 言 radical, which in Chinese usually begins with a diagonal stroke on top, while in Japanese that first stroke usually appears as a flat line (also not unheard of in Chinese). Then there are characters with the 示 radical, like 社, which can appear with the abbreviated ネ-shaped radical or with the full 示 form. In Chinese the two are equivalent, though the full form feels stylized and stuffy, but in Japanese the full form is considered the traditional (kyūjitai) one.

To continue to use imperfect metaphors from the world of the Latin alphabet, there are variants that are completely equivalent from a general perspective, and are decided by fonts, like variations of a or g; historical variants that you might want to insert without changing every normal instance of that letter, like ſ (the out-of-use, historical long s, as in “Congreſs” or “the purſuit of happineſs”); or variations for different contexts, like capital letters. Then there are things like spelling reforms. Think color/colour, or skillful/skilful. This last part is relevant because it’s not the strokes or the radicals that are encoded and entered, it’s the whole character. When it comes to English, I can think of at least one country that wouldn’t be too happy to depend on specific fonts to spell things “right.”

So how are they all encoded?

In some cases, variants get separate code points entirely. This is the case for almost all Chinese simplified characters and the majority of Japanese shinjitai characters. This is also the case for 內 (U+5167) and 内 (U+5185), as well as 說 (U+8AAA) and 説 (U+8AAC).  In these circumstances, it becomes easy to select variants directly through your own input. Many fonts support both variations, but some will choose only one to support, and sometimes even make that one appear like the other!
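You can verify these code points yourself. A quick Python check, using the pairs just mentioned, shows that each member is a distinct character as far as Unicode is concerned:

```python
# Each variant in these pairs has its own code point.
pairs = [("內", "内"), ("說", "説")]
for a, b in pairs:
    print(f"U+{ord(a):04X} vs U+{ord(b):04X}")
    # U+5167 vs U+5185, then U+8AAA vs U+8AAC
    assert a != b  # distinct characters, despite near-identical meaning
```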

At the far opposite extreme, Unicode makes no distinction between variants, as with 言. In these cases, it is entirely up to fonts to decide the appearance of these characters. Sometimes this means choosing a font that displays the character the way you need, and in some contexts it means specifying which appearance you need by tagging text by language (as in HTML). In this post I have used language tags to control how these characters display, but it is up to your browser to select appropriate fonts (so if you saw no difference between the 言's and the 令's, that's why).

In between these two extremes is the situation that is most relevant to the Ten Mincho presentation.

Unicode is not the first encoding ever invented for Chinese characters, and older encodings are still used in some places in East Asia. To ensure that Unicode is compatible with these systems, separate code points exist for the same characters or for variants, called "CJK Compatibility Ideographs" by Unicode. This ensures that going from one system to the other and then back again doesn't lead to the loss of characters. What this also means, however, is that you might eventually paste such a character (a CJK Compatibility Ideograph) and end up with the base unified character, potentially losing the appearance you may have specifically chosen. This is the case for 社 (U+FA4C), a CJK Compatibility Ideograph that might get turned into 社 (U+793E), which by itself won't look the same as U+FA4C unless it's being displayed by a Korean font. It is interesting to note that 令 (U+4EE4) also has a Compatibility Ideograph defined in the Japanese style (U+F9A8, 令), but since Japanese fonts display U+4EE4 with that appearance anyway, it is unlikely to change appearance even if it gets changed into the base character.
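This silent conversion isn't hypothetical: Unicode normalization, which many text pipelines apply automatically, is one place it happens. A small Python sketch using the code points above:

```python
import unicodedata

compat = "\uFA4C"  # 社, CJK Compatibility Ideograph
base = "\u793E"    # 社, the unified base character

# NFC normalization silently replaces the compatibility character
# with its base character, discarding the chosen appearance.
normalized = unicodedata.normalize("NFC", compat)
assert normalized == base
print(f"U+{ord(compat):04X} -> U+{ord(normalized):04X}")  # U+FA4C -> U+793E
```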

Unicode's "standardized variation sequences" provide a solution to this loss of information. By appending a variation selector (U+FE00 through U+FE0F) to the base character, variants can be specified by the author and preserved. So for 社 (U+793E), the variant defined as matching 社 (U+FA4C) is 793E FE00. A complete list of variants is available from Unicode. Their correct appearance just depends on fonts and input methods/software supporting this function.
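In Python terms (a sketch of the mechanism, not of any particular font's behavior), the sequence looks like this, and unlike the compatibility character it survives normalization intact:

```python
import unicodedata

svs = "\u793E\uFE00"  # 社 followed by variation selector VS1

# Two code points, but a supporting font renders them as one glyph.
print([f"U+{ord(c):04X}" for c in svs])  # ['U+793E', 'U+FE00']

# The variation sequence is stable under every normalization form,
# unlike the compatibility ideograph U+FA4C.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    assert unicodedata.normalize(form, svs) == svs
```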

This process is exactly how Ten Mincho deals with the problem of the variant forms of CJK Compatibility Ideographs being lost when they revert to their base characters. It provides glyphs for the base characters' standardized variation sequences with the same appearance as the compatibility characters someone might otherwise be tempted to use. This makes Ten Mincho a very useful font for Japanese indeed, because multiple versions of one character can be used in one document without needing to switch fonts. It also means the distinction between versions isn't tied to the document but to the character itself, and will survive being copied and pasted between other documents and back again.

What it all means

The complexity of all this means that any kind of program or function dealing with these characters has to take these little complications into account. Search functions and machine translation engines need to know which characters are equivalent. Fonts need to decide which code points they need to include in order to be useful, and whether they will support enough variations for one region and language, or for many.
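For example, a search that compares raw code points will miss a compatibility ideograph a user has pasted in; normalizing both sides first handles it. A minimal sketch, with a hypothetical `nfc_find` helper of my own:

```python
import unicodedata

def nfc_find(haystack: str, needle: str) -> bool:
    """Substring search treating canonically equivalent text as equal."""
    nfc = unicodedata.normalize
    return nfc("NFC", needle) in nfc("NFC", haystack)

text = "神社"      # contains the base character 社 (U+793E)
pasted = "\uFA4C"  # the user pasted the compatibility ideograph 社

assert pasted not in text      # a raw comparison misses it
assert nfc_find(text, pasted)  # a normalized comparison finds it
```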

And for the rest of us:

Choosing your font matters, since for a good number of characters it remains the main way to control how they appear. For Chinese, fonts often support traditional and simplified characters alike, but are still released in "TC" and "SC" varieties, where the difference lies in the appearance of unified characters whose standard forms differ by region. It also means that you need a font that supports all the characters in your document. If your font is missing a glyph, you might be able to switch the character to a variant that the font does support, but there's no guarantee that variant will be acceptable to the author of the document. (Of course, with Ten Mincho you would worry about this much less.) As is the case with so many things, paying attention to fonts early can avoid complications later.

It also means that we should notice and take advantage of things like variation selectors and the fonts and software that allow for their use! I for one hope that this becomes more widely known, used, and available.

If this has gotten your interest like it has mine, I would definitely recommend Ken Lunde’s CJK Type Blog for further reading!

Multidirectional Functionality through CSS
Fri, 22 Dec 2017

What's it for?

Much of the Internet is in English, and many American web developers rarely have cause to think beyond languages like French or Spanish when it comes to internationalization. From a development perspective, such languages function more or less like English does, so the general structure of webpages tends to assume a layout based on your average book or magazine, with its left-to-right procession of text down the page. The problem is that other languages with other writing systems exist, and the Internet isn't just for English speakers. If you absolutely must have a financial reason for making websites work with non-left-to-right scripts, know that Arabic speakers account for no small portion of Internet users: hardly an audience you'd want to pass over. But making websites more flexible is important on principle alone. If knowledge of left-to-right scripts is a requirement for using the Internet, then it can never live up to its popular image as a great equalizer.

How it Works

To illustrate the process of adding right-to-left functionality to a left-to-right website, I and two of my peers, Olivia Tsao and Nick Chang, localized a website we created and filled with content from Wikipedia into Arabic. We also could have chosen other right-to-left languages such as Farsi, Hebrew, Pashto, or Urdu, among others.

The original website had two pages, looking like this:

And our final localized website looked like this (none of us know Arabic, so the site is machine translated):

Three Strategies

The gist of the process is to mark the Arabic version of the page as right-to-left by adding dir="rtl" to the html tag, and then to target the relevant styles of the relevant elements (which is not necessarily all elements, as we will see). This is best explained through an example:

Here’s the style of a hypothetical class:

.class {
  color: blue;
  border: solid red;
  float: left;
  margin-left: 2px;
  padding: 2px 0 3px 4px;
}

Basic

The only styles from this class that we need to change are ones related to layout. The most basic way to do this is to add a section of CSS that targets the same class on right-to-left pages and include only the layout-related styles, each with the values for left and right swapped. Not creating parallel, separate classes saves code and maintains the developer’s ability to update non-layout styles on all pages at once. The result:

html[dir="rtl"] .class {
  float: right;
  margin-left: initial;
  margin-right: 2px;
  padding: 2px 4px 3px 0;
}

It’s important to notice here that it is not enough to simply replace margin-left with margin-right, because the original margin-left code from the unmodified .class style will still apply unless margin-left is specifically reset for rtl elements.

Better

Another method gets around this confusion with the following solution:

html[dir="ltr"] .class {
  float: left;
  margin-left: 2px;
  padding: 2px 0 3px 4px;
}
html[dir="rtl"] .class {
  float: right;
  margin-right: 2px;
  padding: 2px 4px 3px 0;
}

Here, separate layout styles are created for each direction so that no sneaky values need to be noticed and reset. Going through the entire style sheet by hand and painstakingly rewriting every section that has to do with layout is hardly ideal, however. It is time consuming at best, and at worst it is a mistake-prone way to work, which in turn requires more time for corrections.

Best

The ideal case is to have a website built using flow-relative values, that is, using “start” and “end” in place of “left” and “right.”

.class {
  color: blue;
  text-decoration: underline;
  float: inline-start;
  margin-inline-start: 2px;
  padding-block: 2px 3px;
  padding-inline: 4px 0;
}

This way, one block of styling can serve elements and pages of any direction, and all it takes is a single attribute, dir="rtl", in the html tag, or even just on a relevant container, for the browser to interpret the style in the opposite direction. The advantages of this method are obvious: layout styles require half as many lines, which makes them easier to keep track of, and since directions are not styled separately, updates to the layout only need to be made once per style. It has the added benefit that any styles that should not change can be hard-coded as "left" and "right," or the relevant element labeled dir="ltr", so that they are not rendered backwards. The unfortunate thing is that only Mozilla currently has full support for these values, with other browsers crucially missing support for float: inline-start and float: inline-end. Work on these logical properties is ongoing, and I strongly hope these values gradually come into standard usage, because this really is a much better system, and is neither superfluous nor an annoyance.

Important considerations

Certain elements may require extra attention, often in the form of graphics that need to be flipped. On this site, that was the language picker. Because the vertical line and arrow indicating its dropdown function are in fact a .jpg file, they cannot be flipped automatically. Thankfully, the solution is simple: just add a flipped version of the file to the site's assets and reference it in the appropriate line of CSS.

It is important to note that right-to-left pages are not automatically the mirror image of left-to-right pages. Media players, for instance, are modeled on physical, real-world devices that are laid out the same everywhere, and should not be flipped. Material Design has a great piece detailing when and when not to flip elements.

Beyond horizontal

For one extra demonstration, we localized the site vertically in traditional Chinese (again machine translated).

Note: vertical text in Chinese looks best with a font whose punctuation glyphs are centered. Tim Saar has an excellent article on using custom fonts for languages that make use of Chinese characters.

The process is similar to the one described above, though different styles need to be changed. (Note: this is NOT the same thing as just rotating all your elements.) Height and width must be switched for most elements, but float: left and float: right are automatically reinterpreted as top and bottom. float: inline-start and float: inline-end will produce the same result, since flow-relative values can handle vertical orientations as well (using flow-relative values also avoids the need to switch height and width). If this seems convoluted, let us take this moment to meditate on the nature of human language to be messy and full of exceptions, even when that language is code.

Again, however, Mozilla stands alone in its comprehensive support for vertical orientation; buttons and forms display correctly in Firefox but no other browser.

The contact form and language dropdown in Firefox:

The same in Chrome:

 

While vertical layouts are uncommon online, it is not the case that they are never used. Vertical layouts often appear in individual elements on pages, as in this Wikibooks page explaining the first line of the Analects, which displays Chinese and Latin scripts both vertically and horizontally:

The traditional Mongolian script can only be written vertically, as can be seen on the traditional script version of the president of Mongolia’s website:

The traditional Mongolian script is now largely confined to Inner Mongolia, in China, while Mongolia itself mostly uses Cyrillic. Chinese and Japanese are more often horizontal than not. A multitude of factors go into determining how the speakers of a language choose to transcribe it, but the mechanics of webpages should have no part in making that decision for them, or in contributing to the homogenization of the world's forms of writing. Knowing the great variety of possibilities for layout and the power of flow-relative styling can go a long way toward preventing that.
