
Documentum Virtual Documents

December 3, 1999 by Michael Trafton

Virtual Documents allow your documents to be represented in a hierarchy (like chapters in a book). This article introduces Virtual Document concepts and also goes into advanced features such as Early and Late Binding.

What is a virtual document?

A virtual document is a document that contains components, or child documents. A virtual document is similar to a folder. When you open a folder, you see the objects inside that folder. When you open a virtual document, you see the objects inside that virtual document. The main differences between a folder and a virtual document are that a virtual document can have content (a folder cannot) and that a virtual document can be versioned: version 1.0 can have three children, version 1.1 can have five children, and version 2.0 can have no children at all (in which case it is just a normal document).

A virtual document is not a special object type: any dm_document or subtype of dm_document can become a virtual document, including custom object types you have created.

What is a component?

A component is simply a child object of a virtual document. A document can be a component of many different virtual documents. A component can itself be a virtual document (in the same way that a folder can contain other folders). A component can have content, can be versioned, and can be linked to multiple folders. It behaves like a normal document, because it is a normal document.

Why do I need Virtual Documents?

Virtual documents are used primarily in three ways: when a document is authored by multiple individuals, when a document is made up of multiple file formats, and when the order of the children matters.

When a document is large and has multiple authors, it’s a good idea to make the document a virtual document. This way, each section of the document can be a component of the virtual document. Since each component is a Documentum object, it can have its own owner, security, approval workflow, and other characteristics.

For example, imagine a report with three sections: an overview, written by Bob; a findings section, written by Susan; and a conclusion, written by Paul. Each of these authors must report to Sally, manager of the group, who has ultimate responsibility for the report.

With all three authors working on the report simultaneously, it would be impractical for all three sections of the report to be contained in a single file. When Susan needs to work on the findings, Bob might already have the document checked out as he works on the overview. The solution is to make the document a virtual document. In this case the virtual document would be comprised of four separate objects:

  • Head or Parent Virtual Document – This is a contentless object owned by Sally. It doesn’t need to have content because it exists solely to organize the components
  • Overview – Microsoft Word document owned by Bob
  • Findings – Microsoft Word document owned by Susan
  • Conclusion – Microsoft Word document owned by Paul

Since each component is its own document, it can be checked in, checked out, and routed for approval independently of the other components. Authors can view one another’s components to synchronize their work, but components remain under the control of their owners. This aspect of virtual documents makes them ideal for collaborative authoring environments.

Virtual documents are also useful for managing the integration of multiple file formats. For example, imagine that you are writing a report in Microsoft Word. It is a short report and you are the sole author, but you want to include some tables (e.g. an Excel spreadsheet) in the report and a PowerPoint presentation as an appendix. The report is really a single logical document, but it comprises three heterogeneous file formats. Virtual documents allow the file structure to mirror the logical structure of the document, regardless of the file formats involved.

Virtual documents are also useful because of their ability to manage the ordering of component files. This is different from the ordering of files in folders, which typically is a sorting by name, size, modification date or other feature. Virtual documents support component ordering as a separate, author-definable characteristic of component documents. The component files of a virtual document cannot be sorted; they always appear in the order defined by the creator of the virtual document. This feature is particularly useful for the management of component files in a large project, such as the chapters of a book. With a conventional folder-based grouping, the sorting that orders the files does not follow the logical structure of the document. With a virtual document, the author retains control over the logical ordering of component files, and changes in the placement of existing or new components are reflected in a re-ordering of the entire virtual document.

What is the difference between a virtual document and a compound document?

A compound document is a specialized type of virtual document. It is created by Documentum’s client applications (WorkSpace and SmartSpace) when you import a document that is linked to other documents.

One of the powerful features of Documentum is its ability to identify and utilize OLE links in Microsoft Office files as you import them. For example, if you import a Word file that contains an embedded link to an Excel file, WorkSpace will detect this compound document structure and will resolve the OLE link. WorkSpace will then automatically import the files that are linked to the original file.

When WorkSpace detects that a file has OLE links to other files, it creates that file as a virtual document and imports the referenced files as components of the virtual document. When this happens, the document is known as a compound document, a special type of virtual document.

Because the order and structure of the links are set inside the file being imported, you cannot remove the components of a compound document or change their order directly. If you wish to do this, you must edit the parent document and reorder or delete the references inside the file. When you check it back in, WorkSpace will remove or reorder the children of the compound document. You can, however, edit the content of the children of a compound document independently of the parent.

How do I create a virtual document?

Creating a virtual document inside WorkSpace is easy. Just select the parent document and choose the VDM/Edit VDM menu item. The selected object will appear in its own special Virtual Document Manager window. To add components to this document, simply drag them from any WorkSpace window and drop them onto the parent document. They will be added as children. Be sure that you have checked out the parent virtual document before trying to add any components. Also note that you can add the same component several times.

Figure 1: Creating a Virtual Document within Documentum Workspace

To change the order of a component, drag it from its current location and drop it onto its new location. A line will appear indicating the target location of the component that you are moving.

Figure 2: Reordering Virtual Document Components within Documentum Workspace

To remove a component from a virtual document, select it and choose the VDM/Remove from VDM menu item (also located on the right-mouse-click pop-up menu).

What is an assembly?

An assembly is a snapshot of a virtual document and all its components. Because a virtual document’s components can be checked out and modified independently of the parent document, the overall content of a virtual document can change without that change being reflected in the parent object’s version tree. Creating an assembly takes a picture of how the virtual document looks today.

You may also freeze an assembly, which sets a flag on all the components of the assembly, preventing them from being modified.
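To make the snapshot idea concrete, here is a minimal Python sketch. It is only an in-memory model, not Documentum code: the names take_snapshot and freeze and the sample version numbers are invented for illustration, and freeze here simply makes the snapshot read-only as a loose stand-in for the immutability flag described above.

```python
from types import MappingProxyType

# Hypothetical model: component name -> list of version labels, newest last.
# None of these names are Documentum APIs.

def take_snapshot(component_versions):
    """Record the newest version of each component at this moment."""
    return {name: versions[-1] for name, versions in component_versions.items()}

def freeze(snapshot):
    """Return a read-only copy of the snapshot (a stand-in for a frozen assembly)."""
    return MappingProxyType(dict(snapshot))

components = {
    "Overview": ["1.0", "1.1"],
    "Findings": ["1.0"],
    "Conclusion": ["1.0", "1.1", "1.2"],
}

assembly = take_snapshot(components)

# A later check-in changes the live virtual document but not the snapshot.
components["Findings"].append("1.1")

frozen = freeze(assembly)
```

After the extra check-in, the live structure reports Findings 1.1 while the assembly still holds Findings 1.0, which is the whole point of taking the picture.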

What APIs, Objects and Attributes are used to manage virtual documents?

When you create a virtual document, many tasks are managed behind the scenes by the tools available in the docbase. These tools include a number of APIs, attributes, and special objects that are used to create and maintain a virtual document.

APIs

To create a virtual document, you use the appendpart API. This API has two main arguments: the object ID of the parent and the object ID of the child. Here is the syntax:

                
appendpart,c,document_id,component_id{,version_label}

            

You issue this command once for each component you are adding to the virtual document. When you are finished adding components, you must issue the save API against the parent object (or check in the parent if it is checked out). The changes to the virtual document are not committed until the parent object is saved.
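The sequence above (one appendpart per component, then one save on the parent) can be sketched as a small Python helper that only builds the command strings. It does not talk to a docbase. The function names are invented for this sketch, the IDs are placeholders, and the shape of the save command is an assumption modeled on the appendpart syntax.

```python
def appendpart_command(document_id, component_id, version_label=None):
    """Format one appendpart call; the version label is an optional trailing argument."""
    command = f"appendpart,c,{document_id},{component_id}"
    if version_label is not None:
        command += f",{version_label}"
    return command

def build_commands(document_id, component_ids):
    """One appendpart per component, followed by a single save on the parent."""
    commands = [appendpart_command(document_id, cid) for cid in component_ids]
    # Assumption: save follows the same "name,session,object_id" pattern.
    commands.append(f"save,c,{document_id}")
    return commands

# Placeholder IDs, not real docbase objects.
for line in build_commands("<parent_id>", ["<overview_id>", "<findings_id>", "<conclusion_id>"]):
    print(line)
```

Keeping the save as the last command in the list mirrors the rule that nothing is committed until the parent is saved.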

Other APIs used to manipulate virtual documents are described below:

  • removepart: Removes a component from a virtual document
  • insertpart: Inserts a component into a virtual document
  • assemble: Creates an assembly of a virtual document
  • disassemble: Destroys an assembly
  • freeze: Marks the virtual document and its components immutable
  • thaw: Makes a frozen virtual document changeable again

Attributes

The main virtual document attribute on a document is r_link_cnt. This attribute contains the number of children that are linked to a virtual document. If r_link_cnt = 0, the object is not a virtual document. If r_link_cnt > 0, the object is a virtual document. Note that r_link_cnt will never have a value of 1, because the parent of the virtual document is also counted as a child. So if you add one component, r_link_cnt = 2. If you add a second component, r_link_cnt = 3, and so on.

Also note that this same attribute applies to dm_folder, indicating how many objects are linked into the folder.

There is no attribute of a document to show that it is a component of a virtual document.
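The r_link_cnt counting rule described above is easy to get wrong by one, so here is a tiny Python sketch of it. The function names are hypothetical, and the rule encoded is simply the one stated in this article, not something read from a live docbase.

```python
def expected_link_count(component_count):
    """r_link_cnt as described above: the parent counts as one link once any component exists."""
    if component_count < 0:
        raise ValueError("component_count cannot be negative")
    return 0 if component_count == 0 else component_count + 1

def is_virtual_document(r_link_cnt):
    """An object is a virtual document when r_link_cnt is greater than zero."""
    return r_link_cnt > 0

for n in range(4):
    count = expected_link_count(n)
    print(n, "components ->", count, "virtual" if is_virtual_document(count) else "normal")
```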

Object Types

Documentum uses a special object to keep track of all the virtual documents in the docbase. It is called dmr_containment, and it has 4 attributes: parent_id, component_id, order_no, and version_label.

The dmr_containment object contains pointers to the virtual document and its components. In this example, we will use a query of the dmr_containment objects to show how this works. Here’s the query:

                
select r_object_id, parent_id, component_id, order_no, version_label from dmr_containment where parent_id = '090f424b80001aea'

            
Figure 3: Results of dmr_containment Query

Note that the parent_id is the r_object_id of a specific version of the parent. This means that the components and the order of the components can change from one version of the parent to the next.

The component_id attribute is actually the i_chronicle_id of the component. This makes sense because the i_chronicle_id represents the entire version tree of an object, and we don’t know what version of the various components we want until we issue the query to gather the components up (for more on this, see Late Binding vs. Early Binding).
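If it helps to picture the containment rows, here is a small Python model of them. It is a sketch only: the Containment class mirrors the four attributes named above, the IDs are made-up placeholders rather than real object IDs, and sorting by order_no is this sketch's own choice for showing the components in sequence.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Containment:
    parent_id: str        # r_object_id of one specific version of the parent
    component_id: str     # i_chronicle_id of the component's version tree
    order_no: int
    version_label: Optional[str] = None

def components_of(rows, parent_id):
    """Return the component IDs for one parent version, sorted by order_no."""
    matching = [row for row in rows if row.parent_id == parent_id]
    return [row.component_id for row in sorted(matching, key=lambda row: row.order_no)]

# Made-up rows: version 1.0 of the parent has two components, version 1.1 has three.
rows = [
    Containment("parent_v1.0", "chron_findings", 2),
    Containment("parent_v1.0", "chron_overview", 1),
    Containment("parent_v1.1", "chron_overview", 1),
    Containment("parent_v1.1", "chron_conclusion", 3),
    Containment("parent_v1.1", "chron_findings", 2),
]

print(components_of(rows, "parent_v1.0"))
print(components_of(rows, "parent_v1.1"))
```

Because each row points at a specific parent version but only at the component's version tree, the two parent versions can list different children while the choice of component version is left for query time.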

Late Binding vs. Early Binding (what’s binding anyway?)

One of the most advanced abilities of virtual documents is the power to specify which version of a component is returned when you query for the children of a virtual document. This ability depends on the features of Early Binding and Late Binding. To illustrate this, let’s consider an example.

Imagine that your software company has a virtual document that represents the owner’s manual for your main software product, a web browser. The virtual document has 4 components: a letter from the president of your company, an installation guide (Chapter 1), instructions for using the product (Chapter 2), and an index.

Figure 4: Software Manual Example

The letter from the company president is frequently updated to inform customers about new products and other enhancements the company offers. The other components are updated to reflect changes in the browser as well. A unique version of each component of the owner’s manual exists for each version of the browser that the company has released.

Now suppose that you receive a call from a customer who needs a copy of the owner’s manual for an older release of the browser. The most current browser is version 3.0, but this customer needs the manual for version 2.1. At first glance, this task seems simple enough – after all, you created an assembly way back when to serve as the owner’s manual for browser release 2.1. But wait – that assembly is a snapshot of the virtual document from over a year ago, including a letter from the president touting “new” products that are now outdated. You need to send a version of the owner’s manual that includes the older versions of the chapters and index while including the most recent version of the president’s letter. Documentum offers a solution: Early Binding and Late Binding.

When you add a component to a virtual document, Early Binding lets you specify the version of the component to be retrieved when the parent document is assembled. In the case of our example, when adding the letter to the manual, you could early bind it so that the most current version is retrieved whenever the letter is requested by a parent document. The version label to bind to is an optional argument to the appendpart API. When you use early binding, it supersedes any late binding you might use.

Late Binding allows you to specify the version of components at the time you are ready to gather them into an assembly. This is done when you issue the DQL query. In the case of our example, you would ask the docbase to return the components that were used for release 2.1 of the browser. The docbase would identify these versions by the symbolic version label “Release 2.1” you added when you checked in the components a year ago. It would then return the Release 2.1 version of all components except those that have been designated exceptions through Early Binding.

When you issue the query to gather up the components of the virtual document, you would use late binding to get the “Release 2.1” version of the components. But since the letter from the President was early bound to use the CURRENT version, the CURRENT version of the letter will be returned even if there is a “Release 2.1” version.
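The precedence rule in this example (an early-bound label wins, otherwise the late-bound label from the query applies) fits in a few lines of Python. This is only a model of the decision, with hypothetical function names, not a call into the server.

```python
def resolve_version(early_label, late_label):
    """Early binding wins when present; otherwise use the label supplied at query time."""
    return early_label if early_label is not None else late_label

# Component name -> early-bound label, or None when the component was added without one.
manual = {
    "Letter from the President": "CURRENT",
    "Chapter 1": None,
    "Chapter 2": None,
    "Index": None,
}

def gather(structure, late_label):
    """Resolve the label to use for every component of the virtual document."""
    return {name: resolve_version(early, late_label) for name, early in structure.items()}

for name, label in gather(manual, "Release 2.1").items():
    print(f"{name}: {label}")
```

Asking for "Release 2.1" returns that label for the chapters and the index, while the letter stays on CURRENT because it was bound when it was added.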

Using DQL to Gather the Components of a Virtual Document

Issuing the AppendPart API to create a virtual document is only the first step to using virtual documents. If you ever want to see these components, you have to query for them using DQL.

DQL has a set of reserved words just for dealing with virtual documents. Here is a list of them:

  • IN DOCUMENT: Specifies the ID of the parent virtual document
  • WITH: Specifies which version of each component to use
  • IN ASSEMBLY: Specifies the ID of the assembly (if any)
  • USING ASSEMBLIES: Tells the server to query an assembly of a virtual document
  • NODESORT BY: Differentiates between versions of components when multiple versions satisfy the WITH clause
  • DESCEND: Tells the server to recursively search through any sub-virtual documents

IN DOCUMENT and WITH

The two main DQL operators are IN DOCUMENT and WITH. The IN DOCUMENT operator specifies the object id of the parent virtual document. WITH functions like a where clause that determines which version of each component is returned.

The basic syntax of a DQL query into a virtual document is:

                
select r_object_id, object_name from dm_document in document ID('<object_id>') with any r_version_label='CURRENT'

            

This is the exact same query that is used by the WorkSpace VDM. It asks the docbase to return the CURRENT version of all the components of this virtual document. Note that all virtual document queries will return the parent of the virtual document as the first item returned. Also note that you cannot sort the results of this query. Components are always returned in the order in which they were created.

Because the IN DOCUMENT clause takes an ID literal as its argument, you must use the ID Operator to convert the ID string to an ID.
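If you build these queries from code, a small helper keeps the ID operator and the quoting in one place. The sketch below only produces the query text. The function name is invented, the 16-character hexadecimal check is an assumption based on the object ID shown earlier in this article, and doubling embedded single quotes is assumed to be the right way to escape a DQL string literal.

```python
import re

def in_document_query(object_id, version_label="CURRENT"):
    """Build the basic virtual document query for one parent object ID."""
    # Assumption: object IDs are 16 hexadecimal characters, like 090f424b80001aea.
    if not re.fullmatch(r"[0-9a-f]{16}", object_id):
        raise ValueError(f"not a 16-character hex object ID: {object_id!r}")
    # Assumption: an embedded single quote is escaped by doubling it.
    label = version_label.replace("'", "''")
    return (
        "select r_object_id, object_name from dm_document "
        f"in document ID('{object_id}') "
        f"with any r_version_label = '{label}'"
    )

print(in_document_query("090f424b80001aea"))
print(in_document_query("090f424b80001aea", "Release 2.1"))
```

Passing a different version_label is how you would ask for the late-bound label discussed in the previous section.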

DESCEND

If any of the components are also virtual documents, you can use the DESCEND option to return their components as well. When doing this, it is useful to include DEPTH in the select list. This will tell you how deep in the tree each component is. For example, use this query:

                
select r_object_id, object_name, depth from dm_document in document ID('<object_id>') DESCEND with any r_version_label='CURRENT'

            
Figure 5: Results of Query with DESCEND Option
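To see what DESCEND and DEPTH give you, here is a Python sketch that walks a nested structure the same way: parent first, then each component in its stored order, recursing into any component that is itself a virtual document. The tree is made up, the function name is hypothetical, and numbering the parent as depth 0 is an assumption of this sketch.

```python
def descend(tree, root, depth=0):
    """Yield (name, depth) for the root and every nested component, in stored order."""
    yield root, depth
    for child in tree.get(root, []):
        yield from descend(tree, child, depth + 1)

# Made-up structure: parent -> ordered components. "Chapter 2" is itself a virtual document.
tree = {
    "Manual": ["Letter", "Chapter 1", "Chapter 2", "Index"],
    "Chapter 2": ["Section 2.1", "Section 2.2"],
}

for name, depth in descend(tree, "Manual"):
    print("  " * depth + f"{name} (depth {depth})")
```

Without the recursion you would only see the four direct children, which is what the query returns when DESCEND is left out.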

NODESORT

You do not have to use r_version_label in your WITH clause if you don’t want to; you can use any attribute or combination of attributes. However, multiple versions of an object can share all attributes except r_version_label (the server enforces unique r_version_labels to ensure uniqueness across versions of an object). This means that a query based on a common keyword will find multiple object versions that satisfy the query. The server will not know which one you want and by default will return the version with the lowest r_version_label. This is not likely to be the version that you want. However, you can use the NODESORT option to give the server a hint about what version you really want when it encounters multiple versions that satisfy the WITH clause.

Here’s how it works: if the server detects that multiple versions satisfy the WITH clause, it will sort those versions using the NODESORT clause, and it will return the top item. For example, take the following virtual document:

Figure 6: Example Virtual Document

The query that we will issue to gather its components will look for only those components that contain the word ‘computer’ as one of their keywords. Now let’s look at all the versions of the object named “One”.

Figure 7: All Versions of the Object Named ‘One’

There are two versions of this object with a keyword = ‘computer’. Which one will the query return? We can use the NODESORT option to make sure that it returns the one that we want. For example, if we want to get the most recently modified component we would use the NODESORT option like this:

                
SELECT * from dm_document IN DOCUMENT ID('<object_id>') WITH ANY keywords='computer' NODESORT BY r_modified_date DESC

            

This will return the most recent version of the object that has ‘computer’ as a keyword, in this case, version 1.2.
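The filter-then-sort behavior can be modeled in Python as follows. The version data is invented for illustration (the dates and keywords are not taken from the figure), and pick_version is a hypothetical name for this sketch, not a server function.

```python
from datetime import date

# Invented sample data for the versions of the object named "One".
versions_of_one = [
    {"version": "1.0", "keywords": ["network"], "r_modified_date": date(1999, 1, 5)},
    {"version": "1.1", "keywords": ["computer"], "r_modified_date": date(1999, 3, 9)},
    {"version": "1.2", "keywords": ["computer", "network"], "r_modified_date": date(1999, 6, 2)},
]

def pick_version(versions, keyword):
    """Keep versions matching the keyword, then return the most recently modified one."""
    candidates = [v for v in versions if keyword in v["keywords"]]
    if not candidates:
        return None
    # Taking the maximum is the same as sorting by date descending and keeping the first row.
    return max(candidates, key=lambda v: v["r_modified_date"])

chosen = pick_version(versions_of_one, "computer")
print(chosen["version"])
```

Swapping the key for a different attribute, or taking the minimum instead of the maximum, corresponds to changing the NODESORT BY attribute or its direction.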

Special Attributes

When querying a virtual document, you will have access to some special attributes that only make sense in the virtual document context. I have described them below:

  • DEPTH: Shows how deep a component of a virtual document is in the VDoc tree
  • CONTAIN_ID: The r_object_id of the dmr_containment object associated with the components of the virtual document

6 Comments

Ganesh Kumar


Hi Michael,
I read your documentation on “Documentum Virtual Documents” and it was very useful for me.

In the “Late Binding vs. Early Binding (what’s binding anyway?)” heading of the article and you have given an example to differentiate between ‘Late Binding’ and ‘Early Binding’ and it was fine.

Clarification needed on the scenario given in the article, you are doing ‘late binding’ to the “letter from the President of the company” and ‘early binding’ to the “component to be retrieved when the parent document is assembled”.

But, we SHOULD NOT specify the VERSION of components to be retrieved as you mentioned on ‘Late Binding’ topic. It is applicable only to an “Early Binding”.

I am kindly forwarding it to you to re-visit the article and please correct me if I am wrong?

Please do send me an email, if anything.

Thanks and Regards,
Ganesh Kumar



Ganesh Kumar July 23rd, 2009 5:31 am

Michael Trafton


Ganesh,

A couple of thoughts. In the article, I’m not suggesting that you use late binding with a Version, I’m suggesting that you use it with a version label called “Release 2.1” – that stands for the release of the software that the manual is about (not the document’s version number). Reading this now, I see that there could be some confusion – I should have used a different example.



Michael Trafton July 23rd, 2009 8:06 am

Akhil Lodha


Can you please suggest how to collate the contents of virtual documents? Kindly suggest.



Akhil Lodha December 11th, 2009 12:52 am

Cristian CIORNEI


Exactly what I was looking for…
Still, how to reshape the queries to use more friendly attributes like: object name.
Or, how to list all the components of all the virtual documents in a specific folder?



Cristian CIORNEI July 7th, 2010 8:05 am

Andrea Merkle


Is there any possibility to create a working weblink to a VD without content?



Andrea Merkle October 20th, 2011 4:51 am
