Microsoft turns data storage upside down

Enterprise plan licenses for OneDrive for Business give you unlimited storage space. Business plans for OneDrive for Business give you 1TB of storage space. But SharePoint only gives the organization 1TB plus 10GB per licensed individual. This is the exact opposite of how data has typically been stored on-premises, where most data lived in a general location on the server and smaller amounts were held in the user folders. So what’s the best way to massage your data so you can take full advantage of the storage space that is available in your subscription? Let’s see.

Understanding how SharePoint and OneDrive for Business are related
SharePoint and OneDrive for Business are linked. SharePoint is the data storage location and OneDrive for Business is the client that manages the sync process. That part is pretty easy to understand. But to confuse the matter, Microsoft gave OneDrive for Business the user’s own private storage space, which although it is stored in SharePoint does not draw from your SharePoint storage quota.

You can think of the OneDrive for Business personal storage location as the user folder from the on-premises world. SharePoint is the storage location where these “user folders” reside and can be thought of as the equivalent of the server from the on-premises world. “User folders” do not take up any of your SharePoint organizational quota.

This means that just like on-premises, your data storage will have two components. There will be private user data locations (unlimited or 1TB depending on your license type) and general corporate data locations (1TB plus 10GB per licensed person). As you can see, the cloud turns data storage on its head. In most corporations, the general data storage location contained the larger amount of data while the user folders contained the small amount of data. In Office 365, it’s the other way around.

In addition to syncing and storing your own private files, the OneDrive for Business client can also sync corporate data stored elsewhere in SharePoint. So this client provides access to files in both locations. Best of all you can choose what you “see” in your OneDrive for Business client and what you are going to sync locally to your computer.

To muddy the waters just a bit more, Microsoft recently announced that OneDrive for Business will soon start to offer the option to automatically sync your local profile’s default data locations, such as the Documents and Pictures folders. It will also have one-button ransomware protection for your files.

So now we’re storing personal data, bits of the user profile, and we’re syncing locally some or all of the data in SharePoint. But we’re still upside down from how business has historically stored data because our corporate space is smaller than the personal space. Basically, if you want to store all of your data in Office 365, then you’ve got some reorganizing to do and some educating of your staff to do so they know where to store things now.

Thinking it through
Knowing that we have more space for private files than we have for general corporate data means that we have to think about how data is going to be stored in the cloud. Or you could purchase more SharePoint storage space and not think about it. But let’s see how we might rethink data storage and convert it from the on-premises model to the cloud model.

To do this we’re going to first fix up our data to make sure that we have naming conventions that will be accepted in the cloud. Then we’ll look at archiving. Finally, we’re going to take a look at who really needs access to files and think about how modern applications and the cloud might mean we can organize them differently.

Ready to migrate files into OneDrive for Business?
You’ll need to be aware of a few limitations when deciding to migrate your files into OneDrive for Business. The biggest gotchas for my clients have been file-naming conventions and total path length. But there’s also a file size cap, and a few file names aren’t allowed either. So you might need to do some data massaging before you migrate.

Here’s what you need to know:

These are the characters that aren’t allowed in your file names: < > : " | ? * / \

These are the file names that aren’t allowed: Icon .lock CON PRN AUX NUL COM1 COM2 COM3 COM4 COM5 COM6 COM7 COM8 COM9 LPT1 LPT2 LPT3 LPT4 LPT5 LPT6 LPT7 LPT8 LPT9.
Also not allowed: any file name starting with ~$, the file name desktop.ini, and any name containing the string _vti_.

These are the folder names that are not allowed: _t _w _vti_ and forms when it is at the root level.

Each file must be less than 15GB in size, which, honestly, should never be a problem. This is data file storage after all, not database storage.

The total file path must be under 400 characters. This one is likely to catch many people.
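If you would rather get a report before touching anything, a short script can walk a folder and flag names that break the rules above. What follows is a minimal Python sketch, not an official Microsoft check. The function names are made up for illustration, the rule lists are copied from this article and may drift from what the service actually enforces, and two details are assumptions: reserved names are matched without regard to case, and 15GB is read as 15 x 1024^3 bytes.

```python
import os

# Rules as listed in this article; the service's real list may differ.
INVALID_CHARS = set('<>:"|?*/\\')
RESERVED_NAMES = {
    "ICON", ".LOCK", "CON", "PRN", "AUX", "NUL", "DESKTOP.INI",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}
INVALID_FOLDER_NAMES = {"_t", "_w"}
MAX_FILE_BYTES = 15 * 1024**3  # assumption: "15GB" read as 15 GiB


def name_problems(name, is_dir=False):
    """Return the reasons (possibly none) why a single name would be rejected."""
    problems = []
    bad = sorted(set(name) & INVALID_CHARS)
    if bad:
        problems.append("invalid characters: " + " ".join(bad))
    if name.upper() in RESERVED_NAMES:  # assumption: case-insensitive match
        problems.append("reserved name")
    if name.startswith("~$"):
        problems.append("starts with ~$")
    if "_vti_" in name:
        problems.append("contains _vti_")
    if is_dir and name in INVALID_FOLDER_NAMES:
        problems.append("reserved folder name")
    return problems


def scan_tree(root):
    """Yield (path, problem) pairs for every file and folder under root."""
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for folder in dirnames:
            path = os.path.join(dirpath, folder)
            for problem in name_problems(folder, is_dir=True):
                yield path, problem
            if dirpath == root and folder.lower() == "forms":
                yield path, "folder named forms at the root level"
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            for problem in name_problems(filename):
                yield path, problem
            try:
                if os.path.getsize(path) >= MAX_FILE_BYTES:
                    yield path, "file is 15GB or larger"
            except OSError:
                yield path, "could not read file size"
```

Looping over `scan_tree(r"D:\Shares\Finance")` (a hypothetical path) and printing each pair gives you a worklist to hand to whoever owns the data.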

Fixing file names
I can’t do a better job at providing a smooth easy solution for fixing the file-naming conventions than Nik D’Agostino, product marketing manager at Lowry Solutions, has in his fabulous article on LinkedIn. So I’ve pulled this information from his article for you.

1) Download the Bulk Rename Utility Tool and extract it.

2) Uncheck all of the groups except for 3 and 12.

3) Make sure folders, files and subfolders are selected under group 12.

4) Fill out group 3 with the characters you want to find and replace. I recommend replacing each of the following characters \ / : * ? " < > | # % with a dash or space.

5) In the left window pane, navigate to the folder whose contents you want to rename (in this case the folder we copied to our desktop). Then select all of the files, folders, and subfolders you want to rename in the right window pane and click the Rename button.

6) Repeat this rename process individually for each of the following invalid characters: \ / : * ? " < > | # %.
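If you prefer a script to a GUI tool, the same find-and-replace pass can be sketched in a few lines of Python. This is an illustration under assumptions rather than a replacement for the utility: the function names are hypothetical, a dash is used as the default replacement, and nothing is renamed until you call `apply_renames` yourself, so you can review the plan first.

```python
import os

# The characters the steps above replace, including # and %.
REPLACE_CHARS = '\\/:*?"<>|#%'


def clean_name(name, replacement="-"):
    """Return name with every character in REPLACE_CHARS swapped out."""
    table = str.maketrans({ch: replacement for ch in REPLACE_CHARS})
    return name.translate(table)


def plan_renames(root):
    """Return (old_path, new_path) pairs, deepest entries first.

    Walking bottom-up means a folder's contents are listed before the
    folder itself, so every old_path is still valid when its turn comes.
    """
    plan = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = clean_name(name)
            if new_name != name:
                plan.append((os.path.join(dirpath, name),
                             os.path.join(dirpath, new_name)))
    return plan


def apply_renames(plan):
    """Rename in plan order, skipping any target that already exists."""
    for old_path, new_path in plan:
        if os.path.exists(new_path):
            print(f"skipped (target exists): {old_path}")
            continue
        os.rename(old_path, new_path)
```

Run `plan_renames` against a copy of the data first and read the list. Two different names can clean up to the same result, which is why the sketch skips a rename rather than overwrite anything.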

Archive your data
Many businesses are carrying around a lot of data that they probably don’t need but can’t bear to part with. In my experience, this actually makes up the bulk of the data currently sitting on servers. When hard drives got cheap, data volumes went up. Because we’re moving to the cloud, it might not make sense to take all of that legacy data forward with us. This might be a hard sell, but consider leaving some of it behind.

You have a couple of options for this:

Archive the oldest files permanently onto external disks and file them away.
Archive the files you probably won’t need but can’t part with just yet into an Azure file store location, or purchase additional SharePoint space and put them into an archive document library. Azure offers SMB shares to file storage locations. Since these are archived files that you probably won’t need, you can just map a few people to the SMB share. The cost of SMB file storage in Azure is pretty reasonable. It will cost you around $0.10 per GB plus some small transaction costs, but it will be quickly accessible and you can map a drive to it, which is incredibly convenient.

Microsoft also offers Block Blob archive storage for $0.002 per GB. If you need to read the data, be aware that you have to pull the entire blob out of the archive at around $26 for the operation, and it can take a number of hours (as many as 15) before retrieval begins. If you truly just want to store the data for the long term, $0.002 per GB is the least expensive way to do it.

The other option is to purchase additional space for SharePoint. This simply expands your data storage space in SharePoint, and you can distribute it among sites and libraries however you like. But this is the most expensive option at $0.20 per GB.
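To see how far apart these options land for a given amount of archive data, a quick back-of-the-envelope comparison helps. The sketch below uses only the per-GB prices quoted in this article. The dictionary labels and function name are made up for illustration, and the numbers leave out transaction costs, the archive retrieval charge, and any question of billing period, so treat the output as a rough comparison rather than a quote.

```python
# Per-GB prices as quoted in this article; check current pricing before deciding.
RATES_PER_GB = {
    "Azure file share (SMB)": 0.10,
    "Azure Blob archive tier": 0.002,
    "Additional SharePoint storage": 0.20,
}


def storage_costs(size_gb):
    """Return the storage-only cost of each option for size_gb gigabytes."""
    return {option: round(size_gb * rate, 2)
            for option, rate in RATES_PER_GB.items()}


if __name__ == "__main__":
    for option, cost in storage_costs(500).items():
        print(f"{option}: ${cost:,.2f}")
```

For 500GB of archive data, that works out to $50 on an SMB share, $1 in the archive tier, and $100 as extra SharePoint space, before any of the excluded charges.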

Start the discussion
Now that you know what the costs are going to be, it’s time to determine how much of your data is actually archive data. Azure says that archive data must not have been accessed for at least 180 days. But I’ll guess that most businesses have data that hasn’t been accessed for 180 days, one year, two years, or even five years. You’ve probably not looked at your data in this way before, but now is the time.

Back in 2014, the Scripting Guy wrote up a simple PowerShell script called Get-NeglectedFiles that gathers a list of files that haven’t been written to for a period of time that you define. I recommend using this method. He uses the file property LastWriteTime to determine when the file last changed, which is exactly what you’ll be after when determining which of your files are truly archive material.

Use the pipeline operator (|) to export the results to a CSV file, where you can then calculate the amount of drive space that you’re going to need for your archive and for your working data.
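If PowerShell isn’t handy, the same idea can be sketched in Python. To be clear, this is not the Scripting Guy’s script. It is a rough equivalent under assumptions: the function names are hypothetical, the file’s modification time stands in for LastWriteTime, and the 180-day threshold in the usage note is only an example.

```python
import csv
import os
import time

SECONDS_PER_DAY = 86400


def neglected_files(root, days, now=None):
    """Yield (path, size_bytes, mtime) for files not modified in `days` days."""
    reference = time.time() if now is None else now
    cutoff = reference - days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # unreadable file or broken link: skip it
            if info.st_mtime < cutoff:
                yield path, info.st_size, info.st_mtime


def write_report(rows, csv_path):
    """Write the rows to a CSV file and return the total size in bytes."""
    total_bytes = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "size_bytes", "last_write_time"])
        for path, size, mtime in rows:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
            writer.writerow([path, size, stamp])
            total_bytes += size
    return total_bytes
```

Calling `write_report(neglected_files(root, 180), "archive_candidates.csv")` returns the total size of the archive candidates. Subtract that from the size of the share and you have your working-data figure.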

Reevaluating folder depth for cloud compatibility
Now for the hard part. Deep, complex folder and file structures don’t work very well in the cloud. The website rule of thumb that people won’t click more than twice to get to something very nearly applies to cloud-stored files too. Yes, it’s a whole new world. Yes, this means changing the way that businesses think of their data. The benefit of this exercise is that it will also expose teams that you didn’t realize existed in your organization, even though they don’t think of themselves in that way.

As you go through the folders, see who has access to them and who is actually using them, then discuss whether the structure needs to be as deep and complex as it is. You’re going to find that most folders are accessed by just a few people, and that those people enter the folder structure at different points to avoid the click, click, click, click drill-down process. We want to expose those entry points because they are logical places to break the chain. You’re also going to expose areas of your folder structure that are used by only one person. Those should be moved into that person’s OneDrive for Business personal storage location.

More motivation for simple folder structures
The limitation that my clients have the hardest time staying under is the 400-character file path limit. Remember that your file path isn’t just the folder depth; it also includes the SharePoint Online URL. For many businesses, this will mean reevaluating their folder structure to make it suitable for cloud storage, and it will give you some leverage when talking to staff about how they are storing and naming files.
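Because the URL counts against the limit, it helps to measure paths as they will look after the move rather than as they look on the server. Here is a rough Python sketch of that estimate. The function names and the example library URL are hypothetical, and exactly how the service counts characters (for example, whether spaces are counted encoded or decoded) is an assumption, so leave yourself some headroom.

```python
import os

MAX_PATH_CHARS = 400  # the limit quoted in this article


def cloud_path_length(url_prefix, root, local_path):
    """Estimate the path length once local_path sits under url_prefix."""
    relative = os.path.relpath(local_path, root).replace(os.sep, "/")
    return len(url_prefix.rstrip("/") + "/" + relative)


def paths_over_limit(url_prefix, root, limit=MAX_PATH_CHARS):
    """Yield (length, path) for every file whose estimate reaches the limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            length = cloud_path_length(url_prefix, root, path)
            if length >= limit:
                yield length, path
```

For example, with a made-up library URL of `https://contoso.sharepoint.com/sites/Finance/Shared Documents`, every file starts out about 60 characters into its budget before the first folder name is counted.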

This can be a painful process, but is it a bad thing? I don’t think so. Often the current folder structure grew on-premises over a long period of time, and as Microsoft Office began to support longer and longer character limits, file and folder names got longer too. The information explosion has also caused workers to give files descriptive names, which are longer still. So we have some sacred cows to deal with during this migration. You are going to get a lot of pushback and unwillingness to sit down and hash through this process. But the end result will be worth it. A flatter folder structure with shorter paths is much easier to navigate on mobile devices. Since your cloud files will end up being viewed not only in OneDrive but also in Microsoft Teams, SharePoint, and other Office 365 applications, people will find that a flatter structure benefits everyone.

Migrating the data
Microsoft has produced a great migration tool for getting your data from on-premises and into SharePoint. You can read about it here. It allows you to pick a folder from your server and populate it into the SharePoint library of your choice.
It is very interesting to note that Microsoft recommends standing up several virtual machines to support the data transfer process. This will let you get multiple upload streams going at once. Take note, too, of the upload speeds. This is probably not something that you’re going to accomplish over a single weekend.

Type of metadata    Examples                                                Average customer experience
Light               ISO files, video files                                  2 TB/day
Medium              List items, Office files (~1.5MB)                       1 TB/day
Heavy               List items with custom columns, small files (~50kb)     250 GB/day
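These averages make it easy to rough out how long an upload will run. The sketch below simply divides the data size by the daily rate for each metadata type. The names are made up for illustration, 250 GB/day is treated as 0.25 TB/day (assuming 1TB = 1,000GB), and real throughput will depend on your connection and on how many upload streams you run.

```python
# Average daily throughput from the table above, in TB per day.
RATES_TB_PER_DAY = {
    "light": 2.0,    # ISO files, video files
    "medium": 1.0,   # list items, Office files
    "heavy": 0.25,   # list items with custom columns, small files
}


def estimated_days(size_tb, profile):
    """Return the estimated number of days to upload size_tb terabytes."""
    return size_tb / RATES_TB_PER_DAY[profile]


if __name__ == "__main__":
    for profile in RATES_TB_PER_DAY:
        print(f"3 TB of {profile} data: about {estimated_days(3, profile):.1f} days")
```

Three terabytes of mostly Office files comes out at about three days of continuous transfer, which is why a single weekend is rarely enough.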

Data storage bottom line: Think it through before you go

The actual data move is going to be the least of your problems. In this case, the real work is all in the data preparation and getting the business truly ready for a move into the cloud. It’s your skill at consulting and working through internal politics that is going to make or break this project. Microsoft has turned data storage thinking on its head by providing huge personal storage and small corporate storage with their plans. If you want to make that work and utilize the included storage, then you’ll have some work to do.






