Managing File Storage

Presented by Tony Asaro, Senior Analyst and Founder, INI Group · media.techtarget.com/.../TAsaro_1_Managing...FINAL.pdf

Content Chaos

• IT departments have no controls on file creation and storage
• There are no quotas
• IT has no authority to delete data
• 70-80% of all data is unused 90 days after its creation
• Companies have 10s of TBs, 100s of TBs, PBs and even 10s of PBs

Voices From the Field

“We literally have no visibility into what is being stored. We know we are storing a lot of files not being used, duplicate data and personal files.”

Voices From the Field

“Last year we had just over 4 PB of file storage and in less than a year we have 5.5 PB.”

No Classification

• All data is treated equally regardless of importance
• There are tons of exact duplicates, content duplicates and near duplicate data
• There are files that haven’t been accessed in over a year, two years, three years, etc.
• Important data is hard to find – what documents have intellectual property; confidential information; evidence?

Scanned Images

• There is still a ton of paper being used for business
• Applications, forms, contracts, etc.
• This information is scanned and becomes an “image”

Stories From the Field

“We discovered we had 125 scanned images of a take-out Chinese menu stored on our Tier 1 storage.”

Backup and Recovery

• File systems range between 500 GB and 2 TB because of backup data sets
• For 100 TB of file data that could mean 50-200 backup data sets
• For 1 PB of file data it equates to 500-2000 backup data sets
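
The arithmetic behind these ranges can be sketched as follows. This is a minimal illustration, assuming one backup data set per file system and decimal units (1 PB taken as 1,000 TB); the function name and parameters are hypothetical and not from the presentation.

```python
def backup_data_sets(total_tb, fs_min_tb=0.5, fs_max_tb=2.0):
    """Return (fewest, most) backup data sets for total_tb of file data.

    Assumes one backup data set per file system, with file systems sized
    between fs_min_tb and fs_max_tb (500 GB to 2 TB on the slide).
    """
    fewest = round(total_tb / fs_max_tb)  # every file system at the 2 TB end
    most = round(total_tb / fs_min_tb)    # every file system at the 500 GB end
    return fewest, most


print(backup_data_sets(100))   # 100 TB of file data
print(backup_data_sets(1000))  # 1 PB, taken as 1,000 TB
```

Under these assumptions the two calls reproduce the 50-200 and 500-2000 ranges quoted above.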

Voices From the Field

“I was told that we can’t buy more storage and that we have to make do with what we have. All we are doing right now is figuring out how we can optimize our environment. I’ll be honest, we’ve had to play games to get more storage and put it under our services budget but we won’t be able to get away with that any more.”

Voices From the Field

“Power, cooling and floor space are huge issues for us. The cost is bad enough. It is three to four times my capital budget. But the bigger problem is I am running out of power and floor space in our data center.”

NAS Challenges

• There are only a handful of solutions
• NAS migrations are complicated, costly, unreliable and time consuming
• Block-level migration tools can transfer performance problems due to fragmentation
• No data verification
• Heterogeneous NAS migrations are even more complicated and difficult

Voices From the Field

“We have 40 NAS heads managing 400 TB of data.”

Voices From the Field

“I feel like I am being held hostage by my NAS vendor.”

Voices From the Field

“My vendor quoted us one million dollars and four months to migrate a few hundred TBs and about 30 NAS systems.”

File Servers and SANs

• There are a number of customers that use file servers attached to their SANs
• Creates more points of control and is often harder to manage; more physical servers
• All of the same issues exist in terms of visibility, optimization, protection, etc.

Unstructured Data = Useless Data

• An abundance of documents have valuable and vital information within them; however, there is no way to easily access or extract it
• It often requires “us” to remember what is important content
• Scanned-in applications, forms, contracts, etc. are often essential to business, but the data is either manually inputted or wasted

Why It Matters

• Huge CapEx and OpEx impact
• More storage, floor space, power, cooling
• Backup and data protection challenges
• Puts a huge strain on IT departments
• The problem will only get worse over time as storage requirements continue to grow
• Unknown impact to the business

What is needed?

• Heterogeneous support for NAS, file servers and CMS solutions
• Requires flexible scalability
• Discovery and analysis tools
• Auto-classification
• Optimization based on use, duplicates and near duplicates
• OCR engines
• High performance, policy-based migration technology that is transparent and provides data validation with ongoing analytics and tiering
• Database integration
• Replication and recovery

Heterogeneous Support

• Support any system that works with NFS and CIFS
• Support proprietary platforms via APIs for CMS systems such as SharePoint and Documentum
• The ability to analyze, optimize, search and migrate data between these platforms transparently

Flexible Scalability

• The platform needs to be able to scale to 100s of TBs, PBs and even dozens of PBs
• Requires clustering, scale-out or grid technologies to add resources as needed
• An index will probably be needed, and it must be capacity-efficient, fast and highly available

Discovery and Analysis

• Wide range of metadata analysis including last access dates
• Discover duplicates, content duplicates and near duplicates
• Users and document types
• Reporting tools to provide insight to make key decisions
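
As a rough illustration of what a discovery pass could look like, the sketch below walks a directory tree, groups byte-identical files by SHA-256 digest and lists files whose last access time is older than a cutoff. It is an assumption-laden sketch, not a description of any product: the names `file_digest` and `scan` are hypothetical, the 90-day default simply echoes the figure quoted earlier, it finds exact duplicates only (not content or near duplicates), and access times are only meaningful when the file system records them.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(root, stale_days=90, now=None):
    """Walk root and return (duplicate_groups, stale_files).

    duplicate_groups: sorted lists of paths with byte-identical contents.
    stale_files: paths whose last access time is older than stale_days.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_days * 86400
    by_digest = defaultdict(list)
    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Check the access time before reading the file for hashing,
            # since reading may itself update the access time.
            if os.path.getatime(path) < cutoff:
                stale.append(path)
            by_digest[file_digest(path)].append(path)
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return duplicates, stale
```

Hashing every file is expensive at the capacities discussed here; a real tool would likely compare file sizes first and keep results in an index.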

Auto-classification

• Users
• Document types
• Last access dates
• Keywords – “confidential and restricted”
• Pattern matching
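
A keyword and pattern-matching classifier can be sketched in a few lines. The rule set below is hypothetical: the keywords come from the slide, while the digit pattern is only an example of what "pattern matching" might look for. The labels and the `classify` function are illustrative assumptions.

```python
import re

# Hypothetical rules as (label, compiled pattern) pairs. The keywords are
# taken from the slide; the number pattern is just an example.
RULES = [
    ("confidential", re.compile(r"\b(confidential|restricted)\b", re.IGNORECASE)),
    ("possible-id-number", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def classify(text):
    """Return the sorted labels of every rule that matches the text."""
    return sorted(label for label, pattern in RULES if pattern.search(text))


print(classify("This memo is CONFIDENTIAL."))
print(classify("Take-out menu, lunch specials"))
```

In practice the rules would be combined with the metadata criteria listed above (users, document types, last access dates) rather than applied to content alone.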

Act on the Information

• Delete or move duplicate data
• Delete or move near duplicates
• Delete or move infrequently accessed data
• Delete or move certain file types

OCR Engine

• Convert scanned images into searchable PDFs
• Enables you to make this “useless” content more useful
• Requires the ability to convert thousands to millions of files in a short window
• Analyze the scanned images for valuable content

File Migrations

• Heterogeneous
• High performance with linear scalability
• Online and transparent
• Scheduling, start/stop, throttling
• Data validation using MD5, SHA1, SHA256
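
The data validation step can be illustrated with a copy that is checked by comparing digests of source and destination. This is a minimal single-file sketch, assuming both paths are reachable as ordinary file paths (for example, NFS or CIFS mounts); `digest` and `migrate_file` are hypothetical names, and scheduling, throttling and transparency are out of scope.

```python
import hashlib
import shutil


def digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hex digest of a file using the named algorithm (md5, sha1 or sha256)."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def migrate_file(src, dst, algorithm="sha256"):
    """Copy src to dst, then validate the copy by comparing digests.

    Returns the digest on success and raises OSError on a mismatch.
    """
    shutil.copy2(src, dst)  # copies contents and, where possible, timestamps
    expected = digest(src, algorithm)
    actual = digest(dst, algorithm)
    if expected != actual:
        raise OSError(f"validation failed for {dst}: {actual} != {expected}")
    return actual
```

SHA-256 is the default here because MD5 and SHA-1 are no longer considered collision resistant, although either still detects accidental corruption.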

Database Integration

• Bridge the gap between structured and unstructured data
• Find valuable information within unstructured content and import into database applications
• Enables the ability to leverage database applications running reports and queries

Replication and Recovery

• Make replicated copies of data versus backup
• Requires a cost effective Tier 3 storage platform
• Schedule scans of changed data and update replicated copies
• Provide file level recovery and intelligent search and retrieval tools
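
One way to picture "scan changed data and update replicated copies" is a pass that copies only files that are new, or whose size or modification time differs from the replica. The sketch below is an assumption, not the presenter's design: `update_replica` is a hypothetical name, it does not propagate deletions, and it would be run on a schedule by something external such as cron.

```python
import os
import shutil


def update_replica(source_root, replica_root):
    """Copy files that are new or changed since the last scan.

    A file counts as unchanged when the replica has the same size and a
    modification time at least as recent as the source. Returns the sorted
    relative paths that were copied.
    """
    copied = []
    for dirpath, _dirs, files in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        target_dir = os.path.join(replica_root, rel_dir)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            s = os.stat(src)
            if os.path.exists(dst):
                d = os.stat(dst)
                if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                    continue  # replica copy is already up to date
            shutil.copy2(src, dst)  # copy2 also carries over timestamps
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

Because each file is replicated individually, file-level recovery amounts to copying a single path back from the replica.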

Summary

• File storage is already a management nightmare and the problem will only get worse unless acted upon
• This issue will become a problem for the data center for the next decade
• We need intelligence and management above the file system and storage system level