Markup

The aim of SGX is to minimize the need for direction from the user by automatically producing facial movement that accurately matches the speaker's vocal performance. However, there may be cases where you want more control over the outcome. For these, SGX provides a markup system that lets you insert markup tags into transcripts. An SGX markup tag consists of comma-separated attribute settings surrounded by parentheses.

Markup Example

Each markup tag applies to everything that follows it, up until the next tag. Thus the position of the tag in the transcript is important. For example:

(mood=serious, mag=1.0, speed=0.9) So he ran out of the way and the next morning 
it turned out that his servant had actually (mood=fearful, mag=1.2) died 
in the landslide further up the hill sometime before he warned him 
(mood=normal, mag=1.0) so that's my ghost story

In the above example:

(mood=serious, mag=1.0, speed=0.9) applies to "So he ran out of the way and the next morning it turned out that his servant had actually".
(mood=fearful, mag=1.2) applies to "died in the landslide further up the hill sometime before he warned him".
(mood=normal, mag=1.0) applies to "so that's my ghost story".
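
The scoping rule above can be illustrated with a short Python sketch. This is not SGX's own parser; it assumes that every parenthesized span in the transcript is a tag, and it ignores any text that comes before the first tag. The names TAG_PATTERN, parse_tag, and split_segments are hypothetical.

```python
import re

# Matches one parenthesized tag, e.g. "(mood=serious, mag=1.0)".
# Assumption: every parenthesized span in the transcript is a tag.
TAG_PATTERN = re.compile(r"\(([^()]*)\)")


def parse_tag(body):
    """Turn 'mood=serious, mag=1.0' into {'mood': 'serious', 'mag': '1.0'}."""
    settings = {}
    for item in body.split(","):
        if not item.strip():
            continue  # skip empty items such as a trailing comma
        name, _, value = item.partition("=")
        settings[name.strip()] = value.strip()
    return settings


def split_segments(transcript):
    """Return (tag_settings, text) pairs; each text runs up to the next tag."""
    matches = list(TAG_PATTERN.finditer(transcript))
    segments = []
    for i, match in enumerate(matches):
        # The tag's text ends where the next tag starts, or at the end.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(transcript)
        text = " ".join(transcript[match.end():end].split())
        segments.append((parse_tag(match.group(1)), text))
    return segments
```

Calling split_segments on the example transcript yields three pairs, one per tag, each holding the words listed above for that tag.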

Markup Attributes

attribute	description	range	default value	tag examples*
mood	Select a mood or engage automatic moods	Any of the moods in the Mood Library or "auto" or "default"	"default"	mood=happy
voice mode	Select a voice mode or engage automatic voice mode	"effort", "auto", or "default"	"default"	voice_mode=auto
magnitude	Increase or decrease in magnitude of movements	0.0 to 2.0	1.0	magnitude=1.3
speech magnitude	Increase or decrease in magnitude of speech movements	0.0 to 2.0	1.0	speech_magnitude=0.8
speed	Increase or decrease in speed of movements	0.0 to 2.0	1.0	speed=1.2
hyperarticulation	Increase in articulatory effort	0.0 to 1.0	0.0	hyper=0.3
jawmax	Limit on jaw opening	0.0 to 1.0	1.0	jawmax=0.4
time	Time of the tag in seconds	0.0 to the audio duration	computed from transcript position	time=12.5

*Note: attribute names don't need to be spelled out in full. A name may be truncated at the end as much as you like, as long as it remains distinct from the other attributes. For example, instead of magnitude=1.4, you may type magn=1.4 or even ma=1.4. However, m=1.4 wouldn't work because it is ambiguous between magnitude and mood.
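
One way to picture the abbreviation rule is as unique-prefix matching against the attribute list. The Python sketch below is an illustration only: it assumes the tag names are the underscore forms shown in the examples column, and SGX's actual matching logic may differ. ATTRIBUTES and resolve_attribute are hypothetical names.

```python
# Attribute names as they appear in tags (assumed from the examples column).
ATTRIBUTES = [
    "mood", "voice_mode", "magnitude", "speech_magnitude",
    "speed", "hyperarticulation", "jawmax", "time",
]


def resolve_attribute(prefix):
    """Return the single attribute whose name starts with prefix."""
    matches = [name for name in ATTRIBUTES if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown attribute: {prefix!r}")
    raise ValueError(f"ambiguous attribute {prefix!r}: {', '.join(matches)}")


print(resolve_attribute("magn"))  # magnitude
print(resolve_attribute("ma"))    # magnitude
# resolve_attribute("m") raises ValueError: it matches mood and magnitude.
```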

Unspecified Values

You don't need to specify every attribute in a tag, only the ones you want to change. Attributes that are not specified in a given tag simply continue to have the same values that they had in the preceding tag if there is one, or their default values if it's the first tag in the transcript.

For example:

(mood=serious, mag=1.0, speed=0.9) So he ran out of the way and the next morning 
it turned out that his servant had actually (mood=fearful, mag=1.2, hyper=0.1) died 
in the landslide further up the hill sometime before he warned him 
(mood=normal, mag=1.0) so that's my ghost story

The speed attribute is 0.9 for the entire transcript, since it is set in the first tag and not overridden in the second or third tag. The hyperarticulation attribute is not set in the first tag, so it starts at its default value of 0.0. From the second tag onward it is 0.1.

By the same logic, when no tag is present at all, all of the attributes have their default values.
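
The carry-forward behavior can be modeled as a running dictionary that starts from the defaults in the attribute table and is updated by each tag in turn. The following Python sketch is a model of the rule as described here, not SGX code. It assumes full attribute names, and DEFAULTS and effective_settings are hypothetical names.

```python
# Default values taken from the attribute table (time is omitted because
# it has no fixed default).
DEFAULTS = {
    "mood": "default",
    "voice_mode": "default",
    "magnitude": 1.0,
    "speech_magnitude": 1.0,
    "speed": 1.0,
    "hyperarticulation": 0.0,
    "jawmax": 1.0,
}


def effective_settings(tags):
    """Return the full attribute state in effect after each tag."""
    current = dict(DEFAULTS)
    states = []
    for tag in tags:
        # Only the attributes named in this tag change; the rest persist.
        current = {**current, **tag}
        states.append(current)
    return states


states = effective_settings([
    {"mood": "serious", "magnitude": 1.0, "speed": 0.9},
    {"mood": "fearful", "magnitude": 1.2, "hyperarticulation": 0.1},
    {"mood": "normal", "magnitude": 1.0},
])
print(states[0]["hyperarticulation"])  # 0.0 (default)
print(states[2]["speed"])              # 0.9 (carried from the first tag)
print(states[2]["hyperarticulation"])  # 0.1 (carried from the second tag)
```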

Markup Without Transcription

Even without transcription (remember: transcription is optional but preferred), you can still insert markup. For example:

(mood=serious, mag=1.0, speed=0.9)

is a transcript with only markup, and no actual transcription of the dialogue. Since there are no words to define the timing of the tag, the tag will apply to the entire file, unless a time value is given using the "time" attribute. For example:

(mood=serious, mag=1.0, speed=0.9, time=1.4)(mood=fearful, mag=1.2, time=2.8)

In this case, the first tag is 1.4 seconds into the file, and the second tag is at 2.8 seconds. Just as with transcript-inserted tags, the tags at these time points define the properties up to the next tag.
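
As a rough model, timed tags divide the audio into intervals, each running from one tag's time to the next. The Python sketch below assumes the last tag stays in effect until the end of the audio, and it says nothing about the stretch before the first tag, which is not covered here. The function name tag_intervals and the 5.0-second duration are hypothetical.

```python
def tag_intervals(timed_tags, duration):
    """Map (time, settings) pairs to (start, end, settings) intervals."""
    ordered = sorted(timed_tags, key=lambda pair: pair[0])
    intervals = []
    for i, (start, settings) in enumerate(ordered):
        # Each tag lasts until the next tag, or until the audio ends.
        end = ordered[i + 1][0] if i + 1 < len(ordered) else duration
        intervals.append((start, end, settings))
    return intervals


intervals = tag_intervals(
    [
        (1.4, {"mood": "serious", "mag": 1.0, "speed": 0.9}),
        (2.8, {"mood": "fearful", "mag": 1.2}),
    ],
    duration=5.0,  # hypothetical audio length in seconds
)
for start, end, settings in intervals:
    print(start, end, settings["mood"])
```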

Batch Markup

Batch markup can be done via SGX-GUI or through the command line. You can apply a Mood, enable Voice Mode, and modify values for Magnitude, Speed, Hyperarticulation, and Jawmax. Note that if you apply specific tags in the transcript(s), they will override the batch tags.

To apply markup values via SGX-GUI, edit the values in the fields in the "Batch Markup" portion of the interface.

To apply markup values via the command line, use the -m command line option followed by the values you would like to set. Here is an example:

-m "mood=happy, mag=1.25, speed=0.5, hyp=0.1, jawmax=.75"
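
If you generate batch settings from a script, the value for -m can be assembled from a dictionary. The Python sketch below builds only the option and its value; the SGX executable name and any other arguments are not covered here and are left out. The function name batch_markup_value is hypothetical.

```python
def batch_markup_value(settings):
    """Format settings as the comma-separated list that follows -m."""
    return ", ".join(f"{name}={value}" for name, value in settings.items())


value = batch_markup_value(
    {"mood": "happy", "mag": 1.25, "speed": 0.5, "hyp": 0.1, "jawmax": 0.75}
)
# Option and value as two separate arguments, ready to append to a
# command line without extra shell quoting.
args = ["-m", value]
print(args)
```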