What Ever Happened to Metadata?

Walter Soyka replied 10 years, 1 month ago 20 Members · 96 Replies

Aindreas Gallagher
May 22, 2016 at 11:33 pm

yes exactly – to be clear – I was just taking the mickey with that batman stick around until you’re the villain/troll quote – that one applied directly to me and many others. I don’t check this place as often as I did, but Simon is one of the people who I would turn up to read. He’s not Batman, he’s an essentially genteel Islington Spock as it were. Unpick that one yanks. He’s probably Corbynite. That’s a limestone mineral.

Anyway – not here that often and I wasn’t here. Please ignore the stupid gag. That said – Master level tag search HUD for editors is a mad wheeze though? Wouldn’t it demand some memory mind you?

I wonder why all Derren Brown scale card counting memory feats involve trained mental spatial arrangement of items. In physical rooms even. I wonder if that practice is important and applicable to critical non-linear editing decision making at the sharp end, where you have trained yourself into a, oh I don’t know… spatial mind palace of discrete footage location compartments for recall. That can’t be right surely. Why did we replace a command line search string with memory based spatial drag and drop MacOS finder again? Three decades ago? I’ve got to be wrong there.

ANYWAY. It’s not David Lawrence material there God knows but it’s at least a small pile of verbiage for you to mercilessly unpick Ubsdell? I do buy spatial recall though. I think it’s as old as the hills and it has form.

https://ogallchoir.prosite.com/
producer/editor.grading/motion graphics

Walter Soyka
May 23, 2016 at 4:37 pm

[Simon Ubsdell] “I was hoping this would lure Walter S. out of his batcave”

I laughed out loud. Thank you. I hope you don’t mind that I’ll jump straight into the weeds with a wall of questionably-topical text.

First, a quick definition. Metadata is data about data.

I see one big barrier to broader adoption of metadata in “average Joe” workflows: the desktop mindset we computer users have adopted over the last 35 years of personal computing. We have trained ourselves to think about documents and files and folders, and we’ve built this into the way our applications work and the way our filesystems work. Now, because our applications encourage us to think of documents as files and our filesystems encourage us to organize our document-files hierarchically, we do.

We are accustomed to working directly with our data. We are less accustomed to working with the data about our data, but there are some notable exceptions (or beacons of hope) that I’ll get to in a minute.

I think Jeremy hit the nail on the head when he talked about the transition away from “the desktop/Finder environment.” The first thing that I think we need to do to embrace metadata-driven workflows is stop thinking about files, period.

Let’s look at “file-based workflows” for a moment. What do project files, source media files, proxy files, render files, waveform peak files, subtitle files, XML files, and deliverable media files actually have to do with our work?

Absolutely nothing.

A file is a data abstraction from the computing Stone Age. It’s a metaphor that made it easy for us to move from working on paper documents to working on electronic ones. This made sense in the 1980s when a computer with word processing software might replace a typewriter, or a computer with spreadsheet software might replace actual paper spreadsheets. Computers allowed us to launch a vicious cycle of increasing complexity; it’s easy to create new documents when we need them, so we do, and this in turn creates more need for more documents.

In analytics circles, there’s a lot of talk about the three Vs: volume, variety, and velocity. We have more data than ever before, it comes in lots of different types (both structured and unstructured), and the pace at which we generate new data is continually increasing.

This isn’t limited to Big Data — this is happening on our desktops, too, and our Stone Age hierarchical filesystems don’t give us the tools to manage this.

Firstly, manual organization of increasing volume is doomed to failure because it is labor-intensive; look at Google’s algorithmic web search versus Yahoo!’s original manual directory of websites, then apply the same concept to our increasingly complex daily work.

Secondly, variety doesn’t exist in most filesystems. A file is a file, and outside of a couple of special attributes like being a directory or an executable, that’s all there is to it. Images, movies, music, word processing documents, spreadsheets, etc. are all the same to the filesystem. The tools we have for working on these objects are completely unaware of the contents. We can move/copy/rename/delete, but we largely cannot understand the contents of a file or operate on the data within it, without opening it in the application that created it.

Finally, velocity just tells us that both of these problems will only ever get worse.

So what are the beacons of hope? The two most common cases where people regularly work on collections of objects, instead of working on objects directly: email and music. Looking for a document (!) you remember you got from Alice at the close of the last quarter? Trying to exclude holiday music and show tunes from your party mix? These are ridiculously hard when you have to work on the data directly, but very nearly trivial when you can use the computer to work on the data about the data.

What makes these cases easy and practical? As Simon, Oliver and Jeremy mentioned, standard schemas: from/to/subject/date, artist/album/track/genre.
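To make the party-mix case concrete, here is a minimal sketch of querying the data about the data instead of the data itself. The `Track` record follows the artist/album/track/genre schema mentioned above; the library contents and the `party_mix` helper are hypothetical and purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """One record in the artist/album/track/genre schema."""
    artist: str
    album: str
    title: str
    genre: str


# Hypothetical library; every entry here is made up for the example.
LIBRARY = [
    Track("Artist A", "Winter Songs", "Sleigh Ride", "Holiday"),
    Track("Artist B", "Stage Hits", "Opening Number", "Show Tunes"),
    Track("Artist C", "Night Drive", "Neon", "Electronic"),
    Track("Artist D", "Roots", "Back Porch", "Blues"),
]


def party_mix(tracks, excluded_genres):
    """Return the tracks whose genre is not excluded (case-insensitive)."""
    excluded = {genre.casefold() for genre in excluded_genres}
    return [t for t in tracks if t.genre.casefold() not in excluded]


mix = party_mix(LIBRARY, ["Holiday", "Show Tunes"])
mix_titles = [t.title for t in mix]
print(mix_titles)
```

The filter never opens an audio file. It works only because every track carries the same fields, which is the point about standard schemas.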

I think that real, pervasive metadata in production is more likely to come top-down than bottom-up. Apple has given us some nice metadata tools, but no real way to communicate across systems. Adobe has done the inverse: XMP is a great way to communicate across systems, but aside from a few very cool features, they don’t drive metadata use in their applications. The three Vs of analytics haven’t overwhelmed most of us on our desktop systems yet. We can still cram a bit more into our brains before we truly have to build some automation on data about data.

But we’re not Netflix:

https://techblog.netflix.com/2016/03/imf-prescription-for-versionitis.html

https://techblog.netflix.com/2016/03/extracting-image-metadata-at-scale.html

We need new metaphors and new abstractions to break our outdated mindsets and help us escape our chicken-and-egg dependence on the file metaphor for data. We need more apps like FCPX which do not require you as a user to think about files in your daily work. We need new standards like IMF, and we need to make better use of existing standards like XMP, in order to break down the walls between applications.

We also need a good understanding of the problems we want to solve, so we can create the right metadata architecture to support our needs.

So what problems do you want to solve?

Walter Soyka
Designer & Mad Scientist at Keen Live [link]
Motion Graphics, Widescreen Events, Presentation Design, and Consulting
@keenlive | RenderBreak [blog] | Profile [LinkedIn]

Bill Davis
May 23, 2016 at 7:00 pm

[Walter Soyka] “We also need a good understanding of the problems we want to solve, so we can create the right metadata architecture to support our needs.”

Another LOVELY Walter S. post.

Dude, I miss chatting with you. Come back more often.

Re: the above…

As most of you know, I’ve been with FCP X from the start. I’ve tried to study stuff like taxonomy and tagging strategies – but even after 5 years, the fact that I don’t edit the same material, for the same clients, to the same ends, over and over again, means I have to constantly switch around even something as basic as my fundamental tagging strategies searching for the best results.

Some things have been consistent and profound. Like the joy of using the REJECT tag to remove the pure crap from my field footage early, so that it never enters my field of view unless I need to mine it for something.

The same goes for many of the sorting and display strategies I’ve developed to map how I think about organization onto my asset searches.

But I feel every day that I can constantly improve on my metadata tagging and access strategies JUST in terms of how I apply keywords to my clip ranges. And this doesn’t even begin to touch at ALL on the types of universal “across whole systems” metadata strategies that Walter discusses in this post.

The only thing I’m totally sure of, is that I’m decidedly NOT going to get bored between now and when my active editing time someday ends!

And so it goes.

New signature under construction and coming soon. Please stand by…

Simon Ubsdell
May 24, 2016 at 12:07 pm

Fascinating post! Many thanks, Walter!

I do wonder whether the Big Data question is fully germane to this discussion though. The problems of Big Data are problems of analysis, whereas the challenges we are talking about here are a lot more humdrum – namely, how to find stuff.

The question “Where is thing X?” is a lot easier to solve than the question “What do things A-Z mean?”

I would suggest that the problem of how to find stuff has been answered very impressively, but just not so much in our industry.

To take the example of Google – a pretty convincing example of an answer to the question of how you find stuff – the solution in its crudest form is fundamentally simple. When we use Google, “all” we are really doing is a string search; in other words, we are asking for a match to a set of characters that we type into the search field.

What makes it clever is the “AI” component. It can cope with “fuzzy” matching (mistyping on my part, approximate matches, etc.), which is obviously important, but Google also “knows” what I might be wanting to type (autofill), it “knows” what I have typed before, or to put it more grandly it “knows” the things that are important to me and the things that are less important, which means it can display the things that are important to me with a higher ranking than those which are not.
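As a rough illustration of the two ingredients described here, fuzzy matching plus ranking by what the user has picked before, the following sketch uses Python's standard `difflib.SequenceMatcher`. It is in no way how Google or Alfred work internally; the `fuzzy_search` function, the labels, and the usage counts are all assumptions made up for the example.

```python
from difflib import SequenceMatcher


def fuzzy_search(query, labels, usage_counts=None, cutoff=0.6):
    """Return labels similar to `query`, best match first.

    Similarity is difflib's ratio (0.0 to 1.0). Ties in similarity are
    broken by how often the user picked a label before (`usage_counts`),
    then alphabetically.
    """
    usage_counts = usage_counts or {}
    needle = query.casefold()
    scored = []
    for label in labels:
        similarity = SequenceMatcher(None, needle, label.casefold()).ratio()
        if similarity >= cutoff:
            scored.append((similarity, usage_counts.get(label, 0), label))
    # Highest similarity first, then most used, then alphabetical.
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [label for _, _, label in scored]


# Hypothetical clip labels and a mistyped query.
labels = ["boat", "boots", "interview", "sunset"]
results = fuzzy_search("baot", labels)
print(results)
```

A real "intelligent search" would weigh usage far more heavily than a tie-break, but even this much is enough to survive a typo.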

Of course, when we talk about this in the context of Google, it seems like rocket science, but there’s an app that I use every day that also knows useful things about how I use my computer, and that’s Alfred. (If you haven’t used Alfred, you are really missing out on a fantastic productivity resource.)

So how about we just ask our developers to give us “intelligent search”?

Would that work for you?

(Of course, intelligent search still relies on more or less intelligent “labelling”, but that’s another question. Again, I think the answer to that one is relatively simple, but I won’t go into that here although it’s been touched on elsewhere in this thread.)

Simon Ubsdell
tokyo productions
hawaiki

Oliver Peters
May 24, 2016 at 12:13 pm

The trouble with searches, though, is that you have to know what you are looking for. It completely negates what I would call the “proximity” effect. In other words, finding something you didn’t know you needed, because you spotted something on the way to what you originally wanted. It was the thing close by but not the original target. The equivalent of shuttling through footage and finding the perfect shot in that process.

Oliver

Oliver Peters Post Production Services, LLC
Orlando, FL
http://www.oliverpeters.com

Simon Ubsdell
May 24, 2016 at 1:00 pm

[Oliver Peters] “It completely negates what I would call the “proximity” effect.”

Does it, though? Could we not have the benefits of both? Having more efficient search doesn’t mean you couldn’t still pick through a random pile of stuff if you wanted or needed to.

And yes, I completely agree about the value of serendipity to the editing process – Serendipity is a truly genius editor and we do well to trust him/her.

In my experience, editing serendipity comes to your aid more effectively when you haven’t over-compartmentalised your material and/or your workflow. I have watched too many editors organise their material to death and lose all the random benefits of a looser organisational structure.

In a sense, my argument for better search is also an argument for more relaxed, less heavy-handed organisation.

Simon Ubsdell
tokyo productions
hawaiki

Oliver Peters
May 24, 2016 at 1:04 pm

[Simon Ubsdell] “In a sense, my argument for better search is also an argument for more relaxed, less heavy-handed organisation.”

Agreed. That makes sense. Ultimately we are all trying to reduce the amount of stuff in the bucket that we have to sift through at one time, without hampering the actual sifting process.

– Oliver

Oliver Peters Post Production Services, LLC
Orlando, FL
http://www.oliverpeters.com

Michael Hancock
May 24, 2016 at 1:22 pm

[Simon Ubsdell] “In my experience, editing serendipity comes to your aid more effectively when you haven’t over-compartmentalised your material and/or your workflow. I have watched too many editors organise their material to death and lose all the random benefits of a looser organisational structure.

In a sense, my argument for better search is also an argument for more relaxed, less heavy-handed organisation.”

That’s one of the things that I’ve come to like so much about FCPX. You can have the best of both worlds.

You can drop all of your footage in an Event, then compartmentalize it into keyword collections to organize it to death. This is useful to me because if I know I’m looking for some b-roll of a boat I can go directly to the keyword collection “Boat” to very quickly narrow my search, without having to remember that I should search for “Boat”. I just twirl down the event and I have a list of keywords to choose from. I find this very helpful in the event that I have to hand off my project to someone else, or revisit it in 6 months. I don’t have to remember exactly how I tagged everything so I can accurately search for stuff. And they can, at a glance, find very specific groups of files without having to learn how I tagged everything (VO, Music, Graphics, Lower Thirds, Logos, Pictures, etc…).

But anytime I want to look for a file in context of all other files I just select the Event and all that organization goes away. At that point, it’s just a massive bit bucket. Add in a Smart Collection to automatically start grouping stuff based on what you’re working on and it makes it relatively easy to find your clips while still giving you a general overview of all of your material. Now as I’m working and becoming more and more familiar with the material the need for detailed organization goes away, but again – someone new to the project would benefit from it. Or revisiting it months down the road.

—————-
Michael Hancock
Editor

Walter Soyka
May 24, 2016 at 1:27 pm

[Simon Ubsdell] “So how about we just ask our developers to give us “intelligent search”? Would that work for you?”

I think intelligent search is metadata-lite. Finding the piece of footage I need is only one small problem we can more easily solve with metadata. I’m actually most interested in the analytics(-lite) potential in media metadata.

What if we broadened our concept of metadata beyond a simple structured description of unstructured data (i.e., keyword-tagged ranges of video), and tracked and stored information about the work we do with our media, so we could query and analyze it?

Let’s say I’m working on a set of deliverables for a specific client, and I am tasked with making an update to their logo. If the system as a whole understood more about the project and the use of media elements, what could it do for me?

The system could tell me what dependencies of that logo are affected by the change: an animation, a video, a brochure, a web site. It could understand the entire dependency graph and know that every video that used the logo animation should be updated, and know that every version of the video should be updated.

It could tell me how much time I spent working on the logo change, and also how long all the dependency revisions took as a result of the change, across my entire team.

It could know what other people I had shared the old logo with, both in my company and externally, and allow me to keep them up-to-date with this change.

It could know that this kind of change is typical for this specific client, but unusual for my clients overall, and offer me insights that might affect my future bids for this client.

It could understand version control, so that when the client inevitably asked to change the logo back to the original, I could roll back the change across every dependent deliverable.

I have data for my work. I want metadata for my metawork.
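The dependency-graph part of this wish list can be sketched in a few lines. This is only an illustration under stated assumptions: the `USED_BY` mapping, the asset names, and the `affected_by` helper are hypothetical, and a real system would have to collect these "is used by" relationships from the applications themselves.

```python
from collections import deque

# Hypothetical "is used by" edges: each asset maps to the things built from it.
USED_BY = {
    "logo.ai": ["logo_animation", "brochure", "web_site"],
    "logo_animation": ["promo_video"],
    "promo_video": ["promo_video_30s", "promo_video_subtitled"],
}


def affected_by(changed, used_by):
    """Return everything downstream of `changed`, nearest dependents first.

    A breadth-first walk over the graph; `seen` stops an item from being
    listed twice when it is reachable by more than one path.
    """
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in used_by.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


affected = affected_by("logo.ai", USED_BY)
print(affected)
```

Time tracking, sharing history, and rollback would all hang off the same graph: each is just more metadata attached to its nodes and edges.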

Walter Soyka
Designer & Mad Scientist at Keen Live [link]
Motion Graphics, Widescreen Events, Presentation Design, and Consulting
@keenlive | RenderBreak [blog] | Profile [LinkedIn]

Simon Ubsdell
May 24, 2016 at 2:02 pm

OK, so this would be the second part of my plan for a simpler, more efficient world, although strictly speaking it’s the first part.

The best, most efficient place to store metadata is the most obvious and easily used.

And that’s the filename.

If instead of putting your file inside a folder called Boat inside whatever application it is you are using, you rather added the characters B-O-A-T to the filename (outside any specific application), you are immediately at a very considerable advantage.

The first advantage is that the metadata travels with the file wherever it goes, so whoever is accessing the file in whatever application immediately knows what you want them to know about it. This is vastly more useful than metadata added inside a specific application which can almost never usefully travel outside.

The second advantage is that you can now access your Boat files instantly within your application, assuming it has a simple search function (which almost everything does).

Typing BOAT into a search field is always going to be faster than clicking around to find a “folder” with your boat files inside it. Typing is always faster than clicking. Always. Operating any application from the keyboard is always faster than using the mouse or pen.

The third advantage to metadata stored in the filename is that it’s human-readable, which obviously confers its own benefits. Even a producer looking at a folder of files in the Finder now knows their contents without having to open them in any application.

There are plenty of ways to edit the filename (and batch edit multiple filenames), but my current favourite is Kyno which has a set of very elegant options for how the original filename is handled and how the new characters are added.

Sometimes the most powerful solutions are also the simplest.

Interestingly quite a few of our clients are now asking for deliverables to have specifically formatted filenames that “embed” a whole range of useful metadata, including date, frame size, codec, aspect ratio, track configuration, language, texted/textless and more. This system has obvious advantages for distribution of deliverables.
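For illustration, here is a minimal sketch of both ideas: finding files by a tag typed into the name, and reading a structured deliverable filename back into fields. The underscore-separated field order, the sample filenames, and the two helper functions are all hypothetical; each client's real naming spec will differ.

```python
from pathlib import PurePath

# Hypothetical field order for an underscore-separated deliverable name.
FIELDS = (
    "title", "date", "frame_size", "codec",
    "aspect_ratio", "tracks", "language", "text",
)


def parse_deliverable_name(filename):
    """Split a structured filename into a dict of metadata fields."""
    stem = PurePath(filename).stem  # drop the extension
    parts = stem.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(parts)}: {filename!r}"
        )
    return dict(zip(FIELDS, parts))


def find_by_tag(filenames, tag):
    """Return filenames containing `tag`, ignoring case."""
    needle = tag.casefold()
    return [name for name in filenames if needle in name.casefold()]


parsed = parse_deliverable_name(
    "HarbourPromo_20160524_1920x1080_ProRes422HQ_16x9_Stereo_EN_Textless.mov"
)
print(parsed["frame_size"], parsed["language"])

boats = find_by_tag(["A001_BOAT_wide.mov", "A002_dock.mov", "a003_boat_cu.mov"], "boat")
print(boats)
```

The scheme only holds if no field value contains the separator itself, which is the usual price of keeping metadata in a filename.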

Simon Ubsdell
tokyo productions
hawaiki
