What are Assembly Edits and why they are important

A deep dive into how Assembly Edits (aka rough cuts or paper edits) are used in video editing

When it comes to spoken content, assembly edits are the backbone of what we do as editors. Story is king, and for the vast majority of stories (news flash, filmmakers: they aren’t all Hollywood movies), soundbites are the driving force.

Sometimes we’re just a lonely editor slogging through a project on our own.

For every other project where you’re not on your own, the bones of the story need to be assembled and agreed upon before the meat is added. But hours of interview footage can be woven together any number of ways, and often the editor is left as the sole steward, ripping out version upon version until everyone is satisfied. Assembly edits have long been the simple, straightforward tool to more quickly reach a consensus among the team, and straightforward tools have a tendency to remain unchanged over the decades. That is, until someone does something crazy like slapping an engine on a bicycle. So let’s take a close-up look at what assemblies have traditionally been and how Simon Says Assemble is adding some horsepower to the editing process.


First, there are plenty of terms used to refer to the many stages of this process. Depending on the context, these terms can be used interchangeably, but it’s important to establish a baseline before moving forward.

  • Selects: Also called Stringouts, this is one of the most general terms used in filmmaking for culled or selected media strung together in a sequence, either chronologically or in a particular order. This can be any type of media such as relevant soundbites in an interview, notable moments during an event, or great bits of broll. Transcriptions of selects are sometimes used to aid the paper cut. I like to think of stringouts being the vehicle for selects, but again, I’ve always seen these terms used interchangeably.
  • Paper Cut: The document or script that details how a basic cut should be put together. It references a series of moments from several clips alongside those clips’ names and timecode. Used by those who need to build the story but can’t edit the media directly. These are built by reviewing stringouts from the editor, but still need to be converted back to a sequence by the editor.
  • Assembly Edit: Also called an A-Cut, an assembly edit is the most basic, no-frills form of an edit. It’s meant to be cut together quickly, without music or broll, to keep labor time down for the editor while revisions are made. An assembly edit makes it much easier to convey how a video will feel, since you can see and hear how all the soundbites play together.

These three terms can also be considered their own stages of forming a story, each with a place and purpose. Overall, this process hasn’t changed since the beginning of time-code (see what I did there?) when we were turning dials on videotape decks. But just getting to the assembly cut can be a lengthy journey, and while many have tried to carve shortcuts into the process, those shortcuts usually just shift the workload rather than lessen it.


  • Stage 1: Prepping Your Media

First, you need to build stringouts of your content. If you’re planning on having an interview transcribed, decide now whether it’s worth pulling selects of your spoken content. Often you may want to just submit an interview in its entirety for transcription and instead focus on pulling broll selects of footage you can cover the story with. Other times, you’re gonna want to cut out content if there ends up being a lot of start/stops or filler time.

In general, stringouts need to be:

  • Concise, with all the fat and noise removed. You don’t want a bunch of silent heads waiting between takes or false starts in your selects.
  • Technically aware, able to point to where the content comes from in relation to the original raw camera media. This is often done with a timecode burn-in that references the source timecode and the source filename.
  • Agile, small enough to be uploaded/passed off anywhere and to anyone. No need for 4K or HD resolutions here; 360x180 should do fine.

Remember that in this stage, you’re simply collecting and converting your media into something that’s easily digestible for building a paper cut.

  • Stage 2: The Paper Cut

Now it’s up to the producer, client, or whoever to build the bones by reviewing the spoken content from video clips, transcripts, or both. For obvious reasons, transcription is highly recommended for working with spoken content. But, if your workflow forgoes transcription to eliminate the time and service cost of transcribing, you’ve really just shifted that burden to the producer who’s still having to type up what’s being said.

The upside with building from transcription docs is that it’s often much faster to scan text and search for specific keywords/phrases. Then you’re just copy/pasting soundbites into your document. But the downside? The editor will still have to hunt within the timecode range for the correct soundbites, as traditional transcriptions just give you a block of text with the in and out source timecodes.

Another drawback with paper cutting is that text cannot reveal a subject’s inflection. When putting soundbites together, or even splicing sentences together (what we lovingly refer to as franken-biting) what might look good on paper doesn’t always translate well on-screen, even if you have the added benefit of video stringouts.
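To illustrate the keyword-scanning upside (and the hunting-around downside), here is a minimal Python sketch. It assumes a transcript delivered as blocks of text, each with a source filename and in/out source timecodes. The `TranscriptBlock` shape, the clip name, and the sample lines are all made up for illustration and are not any transcription service’s actual format.

```python
from dataclasses import dataclass


@dataclass
class TranscriptBlock:
    clip: str    # source filename (hypothetical)
    tc_in: str   # source timecode where the block of text starts
    tc_out: str  # source timecode where the block of text ends
    text: str    # everything said inside that range


def find_blocks(blocks, phrase):
    """Return every block whose text contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [b for b in blocks if needle in b.text.lower()]


# Made-up sample transcript for one interview clip.
blocks = [
    TranscriptBlock("INT_A001.mov", "01:00:10:00", "01:00:42:12",
                    "I grew up on a farm, and we never had a camera."),
    TranscriptBlock("INT_A001.mov", "01:00:42:12", "01:01:15:03",
                    "My first camera was a gift from my uncle."),
]

for hit in find_blocks(blocks, "first camera"):
    # The search only narrows things down to a timecode range; the editor
    # still has to find the exact soundbite inside it.
    print(f"{hit.clip}  {hit.tc_in} - {hit.tc_out}")
```

Finding the phrase is instant, but the result is still a 30-second window rather than a frame-accurate in and out.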

  • Stage 3: The Assembly

Now for putting it all back together. It’s really simple, and designed to be that way. Take what the script is referencing, find the clip, find the timecode range, set in/outs on the actual soundbite, and drop it in. Rinse and repeat a couple hundred times, depending on your project’s length. Export and post for review. Changes? Make the change either from direct notes or an updated paper cut. Export and post again for review. Simple, but let’s be real… it’s monotonous.
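To make that rinse-and-repeat loop concrete, here is a rough Python sketch of what the editor is doing by hand: taking each paper cut row, working out the soundbite’s duration from its source in/out, and laying it down at the end of the sequence. The clip names, the 24 fps non-drop-frame rate, the 01:00:00:00 sequence start, and the shape of the paper cut rows are all assumptions for illustration, not any NLE’s actual format.

```python
FPS = 24  # assumed frame rate; this sketch handles non-drop-frame timecode only


def tc_to_frames(tc, fps=FPS):
    """Convert 'HH:MM:SS:FF' into a total frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def frames_to_tc(frames, fps=FPS):
    """Convert a total frame count back into 'HH:MM:SS:FF'."""
    ss, ff = divmod(frames, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def assemble(paper_cut, start_tc="01:00:00:00", fps=FPS):
    """Lay paper cut rows back to back and return the sequence events."""
    playhead = tc_to_frames(start_tc, fps)
    events = []
    for clip, src_in, src_out in paper_cut:
        duration = tc_to_frames(src_out, fps) - tc_to_frames(src_in, fps)
        if duration <= 0:
            raise ValueError(f"{clip}: out point must come after in point")
        events.append({
            "clip": clip,
            "src_in": src_in,
            "src_out": src_out,
            "rec_in": frames_to_tc(playhead, fps),
            "rec_out": frames_to_tc(playhead + duration, fps),
        })
        playhead += duration  # the next soundbite starts where this one ends
    return events


# Hypothetical paper cut: (source clip, source in, source out)
paper_cut = [
    ("INT_A001.mov", "01:00:42:12", "01:00:50:00"),
    ("INT_B002.mov", "00:12:03:00", "00:12:10:12"),
]

events = assemble(paper_cut)
for event in events:
    print(event["clip"], event["src_in"], event["src_out"],
          event["rec_in"], event["rec_out"])
```

Two rows are easy. A couple hundred, re-done on every round of notes, is where the monotony sets in.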


I’ve broken these three stages down pretty simply, but what if we could merge some of these together? What if the assembly cut was being assembled as we built our paper cut? Simon Says Assemble proposes to do just that.

Simon Says Assemble

Simon Says began as an AI transcription and translation website with NLE extensions for audio and video workflows. They launched Simon Says Assemble in 2020 with a simple highlight-and-drag interface that allows you to:

  • Select the soundbites you want
  • Drag to reorder bites
  • Preview the assembly as you build it
  • Share and collaborate with others
  • Export it all back to your edit

The whole reason the traditional process has existed for so long is simply that it’s not feasible for everyone in the post-production process to sit with an editor and hold their hand from the first cut through the fine cut. It’s not expedient. Thus, all media needs to be re-translated for a world without NLEs to allow others to participate. Which isn’t exactly expedient either, I guess, but it beats the alternative. That is, until Simon Says Assemble came along. By lowering the minimum requirements for access, anyone on your team can start cutting the assembly edit right away, not just the assistant editor.

Collaboration is also possible with built-in share and commenting tools to further bring your team together.

Feedback is attached on the timeline

The time cost and financial cost of using Simon Says Assemble are easy to see. There’s a Pay-As-You-Go plan as well as subscription plans that include credit for transcription. It works out to pennies per minute, is swift and accurate, and lets you dramatically save time on paper cuts and assembly edits.

Case in point: when creating talent reels for development projects, I often send timecoded stringouts to a producer who watches them over and over, and then hand-builds a papercut, transcribing the interview himself as he goes. I then take his paper cut and convert it back to an assembly cut, and then we have a couple rounds of back-and-forth dialog as we fine-tune just the basic A-Cut before I ever start covering the cut with music and broll. For a standard 1-hour interview with the talent, this process usually takes 4–5 days to get us to picture lock so we can quickly color correct and send on to the rest of the development team (a combined total of 40–60 man hours).

Since we started using Simon Says, our time has been drastically reduced. Papercuts now only take a day, as we can preview how they’ll look and sound in real time as we go, and importing an XML does 90% of the work rebuilding soundbites on my timeline into an assembly cut. And since Simon Says Assemble exports to all major NLEs, I can start my media in Final Cut, upload to Assemble, and then export to Premiere Pro. What’s more, the time saved in the beginning I can now pass on to my clients or spend on more enriching things like sound effects and color correction and grading for the final film.

Easily export from Assemble to your NLE

As a real-world editor working on real projects, I’ve found that Simon Says Assemble has completely transformed how I work, and for the better. See how it can change your workflow. Sign up to get free transcription credit.

— Michael Cummins