The Sogeti Smart Workspace offers a great platform for collaboration and document management (amongst others). But although working in such an environment is becoming more and more intuitive and easy, we must not underestimate the need for data retention and data management. In this article I explain the options for data retention, working from the possibilities of the (not so distant) past to the new functions of Office 365.
Those of us who have worked with data retention, SharePoint and Office 365 know that this is not easy. Content types, disposition workflows and record centers all make it into the mix. But this is all going to change within Office 365, and in this article I want to give you some insights into this.
The reason is that the way we used to work with data retention in (for example) SharePoint is changing in Office 365. I want to talk about the data retention possibilities in the classic world, and how we used these at a Dutch organisation. But I also want to focus on the advanced data governance possibilities of Office 365.
This article is made up of five sections:
- Content life cycle;
- Classic SharePoint retention and disposition;
- Use-case of classic SharePoint and an innovative solution;
- Office 365 a year ago (retention policies);
- Office 365 present (retention policies, retention labels and disposition overview).
Content life cycle
Believe it or not, content does have a life cycle. Although, to be fair, not all organizations use every stage of this life cycle. But when looking at retention and disposition of content, we do need a clear understanding of it. And for this I use a classic model.
Classic document management
The life cycle of content starts with its creation. Let’s take a document for instance. A member of your team needs to write a business proposal. So, she opens Word, selects the required document template and starts writing. She stores the document on a SharePoint site.
During the process, she might ask co-workers to look at the document and change the document. Using automatic versioning and co-authoring, this is easy. These are the first steps in a document life cycle.
If needed, the document is reviewed and approved using a workflow and the document gets a new status. For example: “Final” or “Approved”. As this is no longer a working document, the document itself might be moved or copied to another location. These are the next steps in the life cycle of the document.
Classic records management
If and when a document is no longer valid or needed, it can be removed. But some content cannot be removed as easily. This content may need to be retained for a specific amount of time and be accessible when needed. Most organizations have policies in place which detail what content needs to be retained, for how long, and what needs to be done when the retention period has been met.
At the end of a retention period, the content can be removed (there are rules for this as well). This is called disposition. Some content may have to be transferred to an official archiving body (for example, The National Archives in The Netherlands).
To be honest, a lot of organizations I know do not have any processes in place to support this life cycle completely. And we all know of organizations with 100+ TB of content on file shares, SharePoint (on-prem/cloud) sites, OneDrive, you name it. Most of the time, the records management part of the lifecycle is not addressed (fully).
Classic SharePoint data retention and disposition
As for SharePoint: it does offer some options to support data retention and disposition. Even in the on-prem world (dating back to SharePoint 2007) we could use a site template called the Record Center. And this template is still available today.
If you want to support data retention and disposition using SharePoint on-prem or SharePoint Online using the “classic” options, you could use the Record Center template. And this will provide you with an out-of-the-box location for archiving and disposition.
In the classic SharePoint data retention scenarios, we probably use functions like content types, information management policies and disposition workflows. These are also available in the Record Center template. The information management policy can be used from a content type or on the list itself, in which case it applies to all content in that list.
One of the actions which can be added to this policy is the so-called Disposition workflow. This workflow is used in disposition scenarios to control the removal of content when the retention period has been met.
Please note that I did not use the words “record management scenarios”. And that’s because this workflow cannot dispose of records. In a stringent record management environment, you will not be allowed to delete records. Unfortunately, this also applies to this disposition workflow. Kind of ironic, isn’t it? The only workflow designed for disposition cannot work on records.
But that’s not the only drawback of the disposition workflow or the (out of the box) record management capabilities of SharePoint. SharePoint doesn’t provide record managers with any dashboard or information on content which has been retained or is to be disposed of. And working with content types or lists and information policies is somewhat cumbersome. Let’s look at a real-life use case in which we solved some of these problems.
Use-case classic SharePoint disposition
In this use case, we worked with an environment which has an extensive amount of content in SharePoint Online. This content is stored in collaboration sites, project sites and document centres. These document centres are used for distributing, storing and publishing of official documentation which is created in the other sites.
More than one record centre was provisioned and these contained more than 50 record libraries. These were based on retention period. Every library was configured with a view depicting the retention period and running disposition workflows (if any).
This solution did not provide an accurate overview for the record managers. In addition, they needed an overview of to-be disposed content. This list was needed to inform the data-owners of the possible deletion.
So we came up with this innovative solution.
A record management dashboard was created. This dashboard showed all tasks related to the running disposition workflows. A “bulk-approve” button was added to quickly approve pending dispositions.
A PowerShell script (running in Azure) was created. This script enumerated all content which was about to reach the end of its retention period. Information on this content (metadata) was then stored in a SharePoint list. This list could then be used as an overview of all to-be-disposed content and could be filtered, sorted and exported to Excel, providing a historical overview.
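The original script was written in PowerShell and is not reproduced here. As a rough illustration of the idea only, the Python sketch below selects items whose retention period ends within a given window. The `RecordItem` fields, the 30-day window and the sample data are all assumptions made for this example, not details of the real solution.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RecordItem:
    """Hypothetical metadata for one item in a record library."""
    title: str
    library: str
    retention_start: date
    retention_days: int

    @property
    def retention_end(self) -> date:
        # The day on which the retention period has been met.
        return self.retention_start + timedelta(days=self.retention_days)


def upcoming_dispositions(items, today, window_days=30):
    """Return items whose retention ends on or before today + window_days,
    soonest first. Items that are already overdue are included."""
    cutoff = today + timedelta(days=window_days)
    due = [item for item in items if item.retention_end <= cutoff]
    return sorted(due, key=lambda item: item.retention_end)


# Made-up sample data standing in for the content of the record libraries.
items = [
    RecordItem("Contract A", "Retain 1 year", date(2016, 7, 10), 365),
    RecordItem("Invoice B", "Retain 7 years", date(2015, 1, 1), 7 * 365),
    RecordItem("Proposal C", "Retain 1 year", date(2016, 6, 1), 365),
]

# Print one overview row per item that is due for disposition.
for item in upcoming_dispositions(items, today=date(2017, 7, 1)):
    print(item.retention_end.isoformat(), item.library, item.title)
```

In the real solution, the rows were written to a SharePoint list rather than printed, so that record managers could filter, sort and export them.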
I’m still very proud of this solution, especially the disposition overview part. And now it’s great to see that Microsoft has added a disposition overview dashboard in Office 365. So, let’s take a look at those options.
Office 365 a year ago
If you had looked at Office 365 Security & Compliance about a year ago, you would have found several retention policies for SharePoint and OneDrive for Business. These policies allowed you to do (basically) two things for content in these platforms:
- Clean up (remove content based on the last modification date);
- Retain for a specific period, including “forever”.
This last option was also known as the (site) preservation hold option. A hidden and secure library was created, and all content modified or deleted within the specified period was stored within it.
The drawback of these options is that you could not specify which content the policy applied to, nor let users decide for themselves. The options also did not include any machine intelligence to determine if there was any sensitive content. But we did have retention policies.
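To make the difference between the two policy types concrete, here is a minimal Python sketch. It is not how Office 365 evaluates policies internally. The class names, and the assumption that both periods are measured from the last modification date, are mine.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class CleanupPolicy:
    """Hypothetical clean-up policy: remove content older than max_age_days."""
    max_age_days: int


@dataclass(frozen=True)
class RetainPolicy:
    """Hypothetical retain policy: retain_days=None means retain forever."""
    retain_days: Optional[int]


def should_clean_up(policy: CleanupPolicy, last_modified: date, today: date) -> bool:
    # Content is removed once it has not been modified for longer than the limit.
    return today - last_modified > timedelta(days=policy.max_age_days)


def is_on_hold(policy: RetainPolicy, last_modified: date, today: date) -> bool:
    # While on hold, an edit or deletion would first be copied
    # to the preservation hold library.
    if policy.retain_days is None:
        return True
    return today - last_modified <= timedelta(days=policy.retain_days)


today = date(2017, 7, 1)
print(should_clean_up(CleanupPolicy(max_age_days=365), date(2016, 1, 1), today))
print(is_on_hold(RetainPolicy(retain_days=None), date(2000, 1, 1), today))
```

Note that neither function looks at what the content is. That is exactly the drawback described above.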
Office 365 now
At the moment of writing this article, it’s July 2017, and data retention and disposition have received more attention from Microsoft. The data retention policies are still here (Data governance | Retention), and they have been fine-tuned. But the labels and label policies are brand-new and great!
You can still decide to clean up your content by deleting it based on a modification date. And the preservation hold is still here as well (just retain your content for a specific period).
Advanced retention settings now also allow you to detect sensitive content.
This might be appropriate if you want to retain specific content which contains IBAN-numbers or passport numbers. Or you can create your own query for this.
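To give a feel for what detecting an IBAN involves, here is a small Python sketch. It is not Microsoft's detection logic. It combines a simple pattern match with the standard IBAN mod-97 checksum. The function names and the regular expression are my own, and the pattern only finds IBANs written without spaces.

```python
import re

# Country code, two check digits, then 11 to 30 alphanumeric characters
# (IBANs are between 15 and 34 characters long).
IBAN_CANDIDATE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def is_valid_iban(iban: str) -> bool:
    """Check the mod-97 checksum defined for IBANs."""
    # Move the first four characters to the end.
    rearranged = iban[4:] + iban[:4]
    # Replace letters by numbers: A=10, B=11, ..., Z=35 (digits stay as they are).
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_ibans(text: str) -> list:
    """Return all candidates in the text that pass the checksum."""
    return [m for m in IBAN_CANDIDATE.findall(text) if is_valid_iban(m)]


sample = "Please transfer the amount to NL91ABNA0417164300 before Friday."
print(find_ibans(sample))
```

The checksum step matters: it filters out strings that merely look like an IBAN, which keeps a retention policy from firing on random reference codes.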
But these are policies which work without any user-interaction or knowledge. And that’s where the labels come in.
Retention labels and label policies
Data retention labels add another dimension to this subject. These labels (not to be confused with Azure Information Protection labels) can be added to your content. For example: added to your SharePoint Online libraries.
A label can be applied by the user, or it can be automatically applied based on the policy settings. In a SharePoint library this looks like this.
And in Outlook Web an email message can be labelled by selecting it and using the Assign policy options. Very easy to do.
You never knew an email message had this many options, did you?
In a nutshell, this is how it works.
- Create a label (a label determines the retention/deletion actions);
- Publish the label (creates a label policy) to the required Office 365 locations (Exchange, SharePoint, Groups, OneDrive);
- Add one or more labels (if needed).
So, you need a policy and labels in order to get this to work. And creation of the label policy is done by publishing the label! By the way: the location can be all of SharePoint Online or a specific site-collection.
One of the cooler aspects is that this label is published throughout Office 365. So in SharePoint, you don’t need to add anything to lists or content types. You can even have a mandatory (or default) label. There’s a new library setting for this.
So you can have a library “HR – Job applications” with a default data retention policy of “HR – Job applications – retain for 2 years”.
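The relationship between labels, label policies and locations can be modelled in a few lines of Python. This is only a conceptual sketch: `RetentionLabel`, `LabelPolicy`, `publish` and `retention_end` are hypothetical names and do not correspond to any Office 365 API. It also assumes the retention period starts on a date you pass in.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionLabel:
    """Hypothetical label: it determines the retention/deletion actions."""
    name: str
    retain_years: int
    delete_after_retention: bool = True


@dataclass(frozen=True)
class LabelPolicy:
    """Hypothetical label policy: which labels are offered in which locations."""
    labels: tuple
    locations: frozenset

    def offers(self, label: RetentionLabel, location: str) -> bool:
        return label in self.labels and location in self.locations


def publish(labels, locations) -> LabelPolicy:
    # Publishing labels is what creates the label policy.
    return LabelPolicy(tuple(labels), frozenset(locations))


def retention_end(label: RetentionLabel, start: date) -> date:
    """Date on which the retention period of a labelled item ends."""
    try:
        return start.replace(year=start.year + label.retain_years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return start.replace(year=start.year + label.retain_years, day=28)


hr_label = RetentionLabel("HR – Job applications – retain for 2 years", retain_years=2)
policy = publish([hr_label], {"SharePoint", "Exchange"})

print(policy.offers(hr_label, "SharePoint"))
print(retention_end(hr_label, date(2017, 7, 1)))
```

Setting a default label on a library then simply means that every new document in that library gets `hr_label` without the user having to pick it.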
You can add the label(s) to the view of a library. Which might come in handy. Also note the column “Sensitivity”. This column “reads” the Azure Information Protection label, if the document has one. Undocumented feature…
One thing I did notice: I wasn’t able to set information policies on libraries or content types any more in an Office 365 Group (modern team site). The options weren’t there. An Office 365 Group on the left and a classic team site on the right.
Remember the use-case where we created a custom disposition overview solution? Now Office 365 offers an out of the box function. All content which has reached the end of its retention period and needs disposition approval can be viewed from this dashboard. This dashboard is found at Data governance | Dispositions.
This really is cool and useful for all record managers out there. No matter where the content is stored, you can get an overview of content which is going to expire.
So, where does this leave us?
More and more information management functions are becoming available on a tenant-scale. No more site-collection based settings or needing a content type hub to provision changes in an information management policy. Policies are working cross-function (Groups, SharePoint, OneDrive, Exchange).
We now have a means to use machine intelligence to scan for sensitive information and base policies on that. It’s easier to use. People just have to select a label (basically). And we now have a disposition overview! Yeah!!
But Microsoft does need to work on their vocabulary. I mean: AIP labels, data retention labels, tags, etc.
More information on advanced data governance and labels can be found here: