XML is like a language. But, just as two people who speak perfect English might not necessarily understand each other, XML isn’t perfect either.
Or, you could put it another way: XML is perfect, but the humans always manage to screw it up. Somewhat similar to how your favorite baker (who also happens to be a legend in your area) screws up your special wedding cake.
This article explains what XML is, in as simple a language as I can muster (what did I tell you about language just a couple of lines ago?). Later, we’ll see how XML is used, and why there are problems.
Film cutting class – on the phone
You have a film-cutting-taught-on-the-phone business. You cut film, and your student is on the other end with copies of the same pieces of film (this is an expensive class). Let’s pick up the conversation at an interesting juncture:
You: Now, pick up the third film…
Student: Excuse me, which one is the third film?
You: Didn’t I color code them?
Student: Yes, but my nephew ripped them off. Now I can’t tell which is which.
Let’s pause here for an infomercial break. You have no choice but to label your film clips with absolute clarity. Furthermore, each clip must have a unique name. Modern file systems enforce this principle within a folder: two files in the same folder can’t share a name, so clips kept together will never be confused with one another.
We’ll fast forward through the parts where you get your clips renamed again. Then, in a tense moment of suspense:
Student: Okay, I’ve got the four clips lined up.
You: Excellent. We’ve got to make this quick. You’ve only got thirty minutes left.
You: Cut off the first three frames from clip two.
Student: Oh no. Wait a minute… (long expensive pause)… From which end?
You: From the beginning. Keep your film straight, and it’ll be to your left.
Student: Are you sure?
Student: Do I cut between frames three and four, or is it okay if it isn’t perfect?
You: It’s perfectly okay if it isn’t perfect. I’ll be glad to ship you a new batch of film next week.
Frame-perfect accuracy is mandatory in editing. Today, instead of counting frames, we have timecode. To keep the analogy simple to understand, I’m going to continue using frame counts instead.
In an earth-shattering twist:
You: Is the edit done?
Student: Yes. It looks a mess, though.
You: That’s fine. You obviously have to spend a lot more time with me. Now we’re going to do a dissolve.
Student: A what?
You: A cross dissolve. Instead of a cut, the two frames are going to blend into each other.
Student: How do I do that?
You: You’ll have to figure that out for yourself. Unless, you know…I’ve got a few special blends that I sell at a small additional charge.
Student: But I can’t afford any more fees!
You: Well, I’ve told you where the dissolve should start and end, you can figure out the rest.
Student: Wait a minute. Hello. Hello!
Slam cut to black.
Effects are art, and sometimes under copyright. They cannot be easily copied or matched without some form of payment or penalty.
Here’s how a Non-Linear Editor (NLE) would put together four pieces of video:
This small example has:
- Edit points between clips.
- A cross dissolve.
- A Page curl to black effect at the end.
- Overlapping audio.
- Overlapping video.
- Mismatched frame rate (last clip).
- Retimed video (first clip has been slowed down by 50%).
What do we make of this?
The EDL (Edit Decision List)
Since computers can’t speak, they need to communicate with written instructions. How do you go about writing the above sequence? There are two ways:
- You write down everything, including the steps (like a recipe).
- You write down the events (the milestone points).
Obviously, the first method isn’t very economical. After all, in the time it takes to write down everything, you might as well do it yourself piece by piece.
The second solution works well for video. Between two edit points there’s just a long expanse of clip, where nothing changes. If you can mark the ‘In’ point and the ‘Out’ point of each clip, and place them on a linear scale, you might be able to reduce your word count.
Such a simple list of events is called an Edit Decision List or EDL. It’s a ‘list’. What if you had to write the list yourself? Would it look similar to this:
- Start at frame 0
- Video: Frame 1 to frame 49: Clip 1 25p at 50% speed
- Audio: Same.
- Video: Frame 50 to 199: Clip 2 25p
- Audio: Same.
- Video: Frame 150 to 299: Clip 3 25p
- Audio: Frame 150 to 399: Clip 3 25p
- Dissolve at 300
- Video: Frame 300 to 399: Clip 4 29.97p conformed to 25p
- Effect (Page Curl to black) at 399
- Stop at frame 400
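If you wanted a computer to hold the same hand-written list, one way to sketch it is as a set of records, one per event. This is only an illustration in Python: the `Event` class, its field names and the one-line text layout are my own assumptions, not a real EDL format, and the dissolve and page curl are left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One row of the hand-written list (hypothetical structure)."""
    track: str      # "V" for video, "A" for audio
    clip: str       # clip label, e.g. "Clip 2"
    start: int      # first timeline frame (inclusive)
    end: int        # last timeline frame (inclusive)
    note: str = ""  # free-form remark such as speed or frame rate

    def length(self) -> int:
        # Both ends are inclusive, so add one frame.
        return self.end - self.start + 1


# The clip rows from the list above, in the same order.
events = [
    Event("V", "Clip 1", 1, 49, "25p at 50% speed"),
    Event("A", "Clip 1", 1, 49),
    Event("V", "Clip 2", 50, 199, "25p"),
    Event("A", "Clip 2", 50, 199),
    Event("V", "Clip 3", 150, 299, "25p"),
    Event("A", "Clip 3", 150, 399, "25p"),
    Event("V", "Clip 4", 300, 399, "29.97p conformed to 25p"),
]


def format_event(number: int, event: Event) -> str:
    # A made-up fixed layout: event number, track, clip, start, end, note.
    row = (f"{number:03d} {event.track} {event.clip} "
           f"{event.start:>4} {event.end:>4} {event.note}")
    return row.rstrip()


lines = [format_event(i, e) for i, e in enumerate(events, start=1)]
print("\n".join(lines))
```

The point is that every row is written the same way, so another program can read it back without guessing.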
What if this list is generated automatically? There has to be some set way in which everything is written. The above sequence generates the following EDL:
There’s a title, followed by rows of clips in the order they are on the timeline.
All the letters and numbers actually mean something, and are easy once you know what they are. To know more about EDLs and how they work, read the excellent and simple-to-understand Final Cut Pro 7 Manual on the subject.
The EDL was the standard for many years when effects were few and tape was king. In fact, if all you need is to move your timeline about, an EDL is still a perfectly valid solution. This is why it is still supported by many applications.
Enter man. Today, we need to pass around more than just clips on a timeline:
- Tags and other Metadata
- Color information
- What we had for breakfast
- File versions, proxies or intermediaries
The bottom line is, we need a system where we can send a lot more data than what the EDL was designed to accomplish.
Tags to Riches
A file might look like it has English characters, but in reality it’s just 1s and 0s all around. How will a computer know where a word begins or ends? Also, how will it know what to do with a group of 1s and 0s that you have ‘shepherded’?
Enter the tag. You are already aware of one standard that uses tags. You are reading its effect right now, and it’s called HTML.
E.g., if I want to write some text in bold and italics, I’d do something like this – <b>Make this bold</b>, but <i>Make this italic</i>.
This is displayed as Make this bold, but Make this italic.
You have a beginning tag and an end tag. The tag does two things:
- It herds data into groups of words,
- It also tells the computer what to do with them at the same time (like make it bold or italic).
HTML is a standard that isn’t allowed to change much. You can’t create your own tags like <Make this fly> or whatever (though some people try). This is enforced so that different browsers can display a web page the same way.
The powers that be decided that the concept can be used to wrap data into tags for delivery between computers. This serves two purposes:
- A human can read it.
- A machine can read it, too.
E.g., you could create a tag called <My secret name for this file> and add that to your file. Another human would know what you’ve written, and a computer can be programmed to understand the same as well.
Since HTML was holy, a new standard was devised, called the Extensible Markup Language, or XML. The word ‘extensible’ announces to the world that you can extend the number of tags, which means you can create your own – just like the English language.
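Here is a rough sketch of what making up your own tags looks like in practice, using Python’s built-in XML reader. The tag names (`clip`, `secret_name`, `in_frame`, `out_frame`) and the values are invented for this example; they don’t come from any real application. Note that actual XML tag names can’t contain spaces, so the secret name tag is written with underscores.

```python
import xml.etree.ElementTree as ET

# A tiny XML document built from made-up tags (all names are hypothetical).
document = """
<clip>
  <secret_name>Sunset take 3</secret_name>
  <in_frame>50</in_frame>
  <out_frame>199</out_frame>
</clip>
""".strip()

root = ET.fromstring(document)

# A program that was told about these tags can pull the data out.
secret_name = root.findtext("secret_name")
in_frame = int(root.findtext("in_frame"))
out_frame = int(root.findtext("out_frame"))
print(secret_name, in_frame, out_frame)

# A program expecting a different made-up tag finds nothing at all.
missing = root.findtext("clip_title")
print(missing)
```

The file is perfectly valid XML either way. Whether the program on the other end understands it depends entirely on both sides agreeing on the tag names.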
You see trouble down the road, don’t you?
In Part Two we’ll see how Final Cut Pro XML is structured, the two types of Final Cut Pro XML, and why there are problems (as if you didn’t know that already).