For Alfresco 4.0 Enterprise Edition Alfresco Scalability Blueprint


Copyright 2012 by Alfresco and others.

Information in this document is subject to change without notice. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Alfresco. The trademarks, service marks, logos, or other intellectual property rights of Alfresco and others used in this documentation ("Trademarks") are the property of Alfresco and their respective owners. The furnishing of this document does not give you license to these patents, trademarks, copyrights, or other intellectual property except as expressly provided in any written agreement from Alfresco.

The United States export control laws and regulations, including the Export Administration Regulations of the U.S. Department of Commerce, and other applicable laws and regulations apply to this documentation which prohibit the export or re-export of content, products, services, and technology to certain countries and persons. You agree to comply with all export laws, regulations, and restrictions of the United States and any foreign agency or authority and assume sole responsibility for any such unauthorized exportation.

You may not use this documentation if you are a competitor of Alfresco, except with Alfresco's prior written consent. In addition, you may not use the documentation for purposes of evaluating its functionality or for any other competitive purposes.

This copyright applies to the current version of the licensed program.


Document History

VERSION | DATE | AUTHOR | DESCRIPTION OF CHANGE

0.1 | 2012-04-22 | Gabriele Columbro | Initial Table of Contents and first ideas gathering for review

0.2-DRAFT | 2012-06-07 | Gabriele Columbro | Incorporated preliminary comments from Docs and Eng, and completed a full first version.

0.3-DRAFT-HM | 2012-06-25 | Helen Mullally | Copyedit

1.0.4-DRAFT-RF | 2012-06-26 | Rui Fernandes | Tech Review

0.5 | 2012-06-27 | Derek Hulley | Tech Review to pg 25

0.6 | 2012-06-29 | Andy Hunt | Tech Review to pg 29

0.7 | 2012-06-29 | Briana Wherry | Merged Comments and accepted editing comments.


Table of Contents

INTRODUCTION
  SYNOPSIS
  SCOPE
  INTENDED AUDIENCE
    Executives
    Advanced Alfresco 4 developers
  READERS’

INTRODUCTION TO ALFRESCO ECM SCALABILITY
  ALFRESCO ECM SOLUTIONS
    Alfresco core ECM Solutions
    Alfresco extended ECM Solutions
  ALFRESCO ECM SOLUTIONS SCALABILITY
  ECM SOLUTIONS SCALABILITY FACTORS
    Performance
    Load distribution
    Availability
    Out of scope performance scalability factors
  ALFRESCO 3 ARCHITECTURES SCALABILITY GOTCHAS
    In process (or in transaction) content and metadata indexing
    Lucene index tracked per node and Ehcache replicated independently
    ACL post query permission checking
    Non-HTTP based Virtual File systems not available in HA mode
  ALFRESCO 4.X NEW SCALABILITY FRONTIERS
    Apache Solr indexing tier
    Clustered file system interfaces
    Alfresco Transformation Server
  A NOTE ON THE ALFRESCO CLOUD SERVICE
    Transparent multi-tenancy
    Index sharding

THE ALFRESCO 4 BENCHMARKS
  BENCHMARK TYPES
    Scalability Benchmarks
    Comparative Benchmarks
    System boundary discovery benchmarks


Introduction

This chapter introduces the scope of this document, the intended audience, and the required level of Alfresco knowledge. It also includes a list of documents for recommended reading.

Synopsis

This document presents the results of the Alfresco Enterprise benchmarks and uses them to analyze Alfresco 4 performance and scalability in an Enterprise collaboration scenario. Use this document as a reference for practical recommendations and best practices for successful sizing, architecture, and deployment of large-scale Alfresco solutions.

Scope

This document applies to Alfresco Enterprise 4.0 (and above), which is the target platform on which the first benchmarks were run. Benchmarks are a continuously evolving effort, so further benchmarks might disprove or update the results and considerations presented in this document. Therefore, this is considered to be a “living document” and is generally intended for the whole Alfresco 4.x series. Specific tuning best practices might be superseded or become obsolete once new benchmarks are run on future versions.

Intended audience

This document is intended for the Alfresco Enterprise customer and partner network, with special focus on the most technical teams, such as Enterprise Architecture, Development, Support, and Operations. Because it requires a deep understanding of the architecture, components, and technologies involved in the operations of the Alfresco platform, the ideal reader should hold Alfresco Certified Engineer (ACE [1]) or Alfresco Certified Administrator (ACA [2]) certification.

Executives

Although this document is intended for a technical audience, the benchmark results may also be of interest to a non-technical audience. Two sections are suitable for this audience:

Error: Reference source not found - presents the benchmarks results in the context of a real-life Enterprise deployment scenario

Error: Reference source not found - provides general conclusions and statements that are applicable to the benchmarks results

Advanced Alfresco 4 developers

Advanced developers who are familiar with Alfresco 4 and have hands-on experience with it may wish to skip directly to the chapters The Alfresco 4 Benchmarks and Error: Reference source not found. These chapters provide a quantitative analysis of Alfresco 4 performance and scalability based on the benchmark results. For the complete reference of Alfresco and underlying components configuration, refer to the Error: Reference source not found appendix.

[1] http://university.alfresco.com/ACE.html
[2] http://university.alfresco.com/ACA.html


Readers’ prerequisites

In addition to being ACA/ACE certified, readers should have a full understanding of the following technical areas:

Alfresco deployment stacks

Alfresco 4 architecture and components, especially Solr

Alfresco clustering and high availability

General hardware/infrastructure/network performance concepts

Load and performance testing

Alfresco 3.x scalability concepts and Lucene integration

The following documentation is recommended reading:

Scale your Alfresco Solution [3], available on the Alfresco Support site - offers a general introduction to architectures, design, and tuning of Alfresco highly-scalable solutions. Since it is based on Alfresco 3.2, some concepts are superseded by this benchmark document, such as the introduction of a separate indexing tier with Apache Solr in Alfresco 4. However, it still represents a valid reference and qualitative introduction to the concepts presented in this document.

Alfresco Online Documentation - see the sections on High availability [4] and Solr integration [5]

Alfresco DevCon developer presentations available in Slideshare [6] - in particular, see the presentations on Solr integration [7] and scalability [8]

Alfresco training [9] or a dedicated Alfresco Consultancy [10] developer enablement package

For additional questions and enablement on Alfresco scalability, contact Alfresco Support or your local Solution Engineer if you are an Alfresco Customer, or refer to the Alfresco Partner Team [11] if you are an Alfresco Partner.

Assumptions

The benchmarks and the scalability analysis results presented in this document are based on a number of assumptions, ranging from technical choices to the process and scenario used. The assumptions reduce the complexity of the benchmark exercise and try to match the expectations of the widest audience within the Alfresco Enterprise network.

The following list outlines the high-level assumptions:

[3] http://support.alfresco.com/ics/support/DLRedirect.asp?fileID=18158 (login required)
[4] http://docs.alfresco.com/4.0/topic/com.alfresco.enterprise.doc/concepts/ha-intro.html
[5] http://docs.alfresco.com/4.0/topic/com.alfresco.enterprise.doc/concepts/solr-intro.html
[6] http://www.slideshare.net/alfresco
[7] http://www.slideshare.net/alfresco/understanding-the-solr-integration
[8] http://www.slideshare.net/alfresco/performance-and-scalability-10161703
[9] http://university.alfresco.com
[10] http://www.alfresco.com/services/consulting/
[11] https://partners.alfresco.com/


Alfresco 4.0.0 Enterprise Edition was used, leveraging Apache Solr for the indexing tier. However, the focus of the benchmark was on the Alfresco repository, so Solr physical scaling is treated as a black box for the rest of this document.

The use of Alfresco and Share as a large-scale Enterprise Collaboration platform is the main scenario on which to prove the platform, especially in a search-intensive use case.

The Alfresco Out of the Box content model and functional configurations were used.

A pre-defined, reproducible, incremental approach was used to run the benchmark to ensure that a like-for-like comparison between results is as precise as possible. Nevertheless, certain configurations have been changed during the benchmark while exploring larger data points, as product configuration bottlenecks were discovered. See the Error: Reference source not found for a detailed report on the configurations used.

For specifics of the technical implementation, refer to the following sections.

Out of scope

Based on the process, software, hardware, scenario, assumptions, and resources used to run the benchmarks, the results, and therefore the analysis presented in this document, are NOT applicable to:

Alfresco Community and Cloud editions

Alfresco versions other than 4.0.0; 4.0.x maintenance releases are expected to improve on the metrics presented in this document, based on the product configuration and bottlenecks studied in the benchmark

Highly customized instances of Alfresco, either in terms of content model, configuration, or custom developed functionalities

Different components of the stack (for example, JVM, DBMS, Application Server, Operating System) and their respective configuration

Different scenarios and APIs used, other than the Enterprise Collaboration Platform scenario that was reproduced for the benchmarks, as described in Alfresco 4 Benchmark details

Notice to readers

Benchmarking is a formal and highly technical process that aims at optimal configuration and at reproducibility between different tests. Metrics collected from the benchmark are the result of a very fragile balance between stack components’ configurations, client and server resources, and the exact scenario used for testing.

Alfresco sizing, configuration, and tuning, as for any cross-stack vertical platform, are dependent on the type of operations executed against the platform, their frequency, and the APIs used to perform those operations.

In this sense, the results presented in this document should always be referred to as a sizing and benchmarking model, rather than as blind performance requirements metrics.

Most importantly:


Do not apply the benchmark results and scalability analysis “as is” to your context. Instead, use them as a relative measurement to estimate the potential load requirements of your application. More simply, use them for what they are: results from an official benchmark to be used as an informative reference and source of inspiration.

Acronyms and symbols

The following acronyms and symbols are used throughout the document:

ACRONYM/SYMBOL DESCRIPTION

ACE Alfresco Certified Engineer

ACA Alfresco Certified Administrator

BFSIT Bulk File System Import Tool [12]

TXN Transaction

DBMS DataBase Management System

HA High Availability (Available)

OOTB Out of the Box

ECM Enterprise Content Management

DR Disaster Recovery

I&AM Identity & Access Management

[12] http://docs.alfresco.com/4.0/topic/com.alfresco.enterprise.doc/concepts/Bulk-Import-Tool.html?resultof=%22%62%75%6c%6b%22%20%22%6c%6f%61%64%22%20


Introduction to Alfresco ECM Scalability

This chapter defines the concepts of performance and scalability in the context of the typical ECM solutions and scenarios implemented on Alfresco Enterprise, discusses the scalability limits of Alfresco 3, and introduces the new architectural features in Alfresco Enterprise 4 that respond to high-scale content requirements.

Alfresco ECM Solutions

Thanks to its modern and flexible design and its modular architecture, Alfresco Enterprise supports, and has been successfully deployed in, a vast range of vertical and horizontal solutions.

Alfresco core ECM Solutions

Figure 1 shows a common classification of the core landscape of typical Alfresco-based solutions:

Figure 1 Alfresco Core ECM Solutions

Systems of Record (or Headless Content Management Platform)

o Typically characterized by the use of Alfresco as a pure Content Server, as a back-end system to store massive archival or controlled document management systems; optionally, uses the Records Management module

o Typical Features: Massive migration or batch injection (10M+), ID or search based retrieval of content, integration using remote APIs/CMIS, simple content model, batch mostly write operations, users mostly read operations

o Although some support information will be provided, these use cases are not the main focus of the benchmarks covered in this document.


Systems of Engagement (or Enterprise Collaboration Platform)

o Characterized by the use of Alfresco – and Alfresco Share – as an Enterprise Collaboration platform, leveraging all the potential of the Share interface to offer project/team/department-based collaboration spaces to the extended Enterprise

o Typical Features: High interactive user concurrency, especially during working hours, strong reliance on search, use of the full UI functionalities of Share, users performing mostly read and search operations but also writes, global deployments

o This represents the main use case/scenario/solution that was tested in the benchmarks presented in this document; therefore, these results typically apply better to the deployment of this type of solution.

Web Content Services (WCS Solutions)

o Typically characterized by the use of Alfresco as an editorial and publishing platform: Alfresco is used as a back-end system to support the Enterprise Web Content delivery process, internally or across the firewall, either directly or via remote content deployment

o Typical Features: Use of Workflow, Rendition APIs, Replication or Transfer Service, CMIS, Web Quick Start, from small to large write interaction, depending on the editorial concurrency, mostly read on the front-end (if directly connected to Alfresco)

o Although some support information might be found in this benchmark, these use cases are not the main focus of the benchmarks covered in this document.

For a full description of this classification with more detailed examples, refer to the “Scale your Alfresco Solutions” document.

Alfresco extended ECM Solutions

Figure 2 Alfresco extended ECM Solutions landscape

The capabilities of the Alfresco platform are not limited to the three main use cases already presented. Alfresco is increasingly being used to support use cases, such as:


• Social Content Management and integration with social networks. This can be done via the Alfresco Social Publishing APIs or via supported integrations like the Alfresco/Jive integration [13].

• Records Management and Archival based on the US DoD 5015.2 standard compliant module. This will soon be released in its new version (2.0) in order to simplify customization and extensibility to support compliance with other national, regional, or particular standards and specifications.

• Business Intelligence to perform Content and/or Workflow OLAP analysis. In the latter case the BPMN 2.0 compliant Activiti [14] platform can be leveraged, as it is seamlessly integrated in Alfresco.

• Cloud connected content platform to manage content leveraging the cloud infrastructure, either with Alfresco in the cloud [15] (see section A note on the Alfresco Cloud) or with custom private cloud Enterprise deployments. Alfresco is active in this area, and private <-> public cloud synchronization features are expected in the product in the near future to enable even more distributed cloud deployments.

As the wide variety of solutions presented shows, the interaction pattern with the platform depends heavily on the particular Alfresco scenario. It can vary considerably in terms of user concurrency and operation type, typical size of the repository, and the integration API or user interface in use. This typically means that each solution stresses different internal components of Alfresco, or different underlying components of the infrastructure, and therefore has potentially very different architecture, design, and tuning requirements.

Alfresco ECM Solutions Scalability

General performance tuning concepts apply. However, as previously explained, Alfresco ECM supports such a varied mix of use cases, processes, and APIs that the scalability of the Alfresco platform is strictly correlated with the usage scenario.

Quoting an interesting blog post [16]:

“ECM systems can be scalable or they can fail to scale well. They can have modular architectures that allow you to simply add more elements as required, rather than multiply the entire system as things expand. They can be scalable in that they have built in high availability, automatic failover support, run on enterprise grade application servers and databases. They can be scalable because they have been tested and proven to handle very high volumes (hundreds of millions of documents) in the repository and/or tested and proven to handle very high throughput rates (tens of thousands per hour or minute). There are many ways in which an ECM system can scale or not. But the biggest element determining whether the system can scale is your usage of it”

As ECM semantics grow alongside the content explosion we are experiencing, with social networks, collaboration platforms, and content mobility increasingly penetrating the Enterprise, it is fundamental to take a solution-oriented approach when defining your specific scalability requirements. In other words, ECM scalability has no meaning per se; it must be placed in a specific deployment context and usage scenario, such as the scenario that can be re-created in a benchmark lab.

[13] http://www.alfresco.com/products/integrations/jive/
[14] http://activiti.org/
[15] https://cloud.alfresco.com
[16] http://www.realstorygroup.com/Blog/1403-Scalable-ECM

Therefore, for the remainder of this document, the word “scalability” refers to the scalability of the Alfresco platform for an Enterprise collaboration scenario that is typical of systems of engagement, rather than referring to platform scalability in general.

With this in mind, the following sections introduce the common factors typically required of a large-scale Alfresco ECM solution. The benchmark results covered in the following chapters are based on these considerations.

ECM solutions scalability factors

Although ECM scalability is often a fairly linearly defined concept, it is heavily solution-dependent. Some areas of scalability may affect the way Alfresco is deployed and architected, and should be taken into account when designing your solution. The following section provides a suggested classification of the scalability factors, and also identifies whether or not they are covered in this document.

Performance

Performance is an important factor in scalability, as it directly affects the end user experience. For this reason, performance is typically one of the most stringent and most discussed customer requirements. Clearly defining the required performance characteristics before the implementation of the project will help establish an estimate of the load that the Alfresco platform will be required to support. Performance is typically measured as the Alfresco response time to user or batch operations. A simple, approximate, but still statistically valid way of defining the performance of an ECM platform is the average response time across all user operations. Throughput is another common measurement.

The types of user operations that compose the test scenario are fundamental. Since different operation types might hit different underlying components, it is also useful to measure the average response time or throughput for a specific operation type (for example, read, write, or search).
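As an illustration of these measurements, the following minimal Python sketch computes the overall average response time, the average per operation type, and the throughput from a list of timing samples. The `Sample` record, the function names, and the sample values are hypothetical; they are not taken from the benchmark tooling or results.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sample:
    """One timed user operation (hypothetical record layout)."""
    operation: str      # for example "read", "write", or "search"
    response_ms: float  # measured response time in milliseconds


def average_response_ms(samples):
    """Average response time across all operations."""
    return mean(s.response_ms for s in samples)


def average_by_operation(samples):
    """Average response time for each operation type."""
    grouped = defaultdict(list)
    for s in samples:
        grouped[s.operation].append(s.response_ms)
    return {op: mean(times) for op, times in grouped.items()}


def throughput_per_second(samples, elapsed_seconds):
    """Completed operations per second over the test run."""
    return len(samples) / elapsed_seconds


# Illustrative values only, not benchmark results.
samples = [
    Sample("read", 100.0),
    Sample("read", 200.0),
    Sample("write", 400.0),
    Sample("search", 300.0),
]
overall = average_response_ms(samples)
per_operation = average_by_operation(samples)
rate = throughput_per_second(samples, elapsed_seconds=2.0)
```

Keeping the per-operation averages next to the overall average makes it easier to see when a single operation type, such as search, dominates the aggregate figure.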

Performance snapshots (measurements taken during one test run) are of interest for the benchmark; however, in the context of scalability, performance is typically measured across dimensions, evaluating the impact on performance caused by changes to these dimensions. For the scope of this document, using the Enterprise collaboration scenario, a possible definition of performance requirements could be:

Baseline performance:

Alfresco should support an average response time below $t_0$ milliseconds for a baseline of $u_0$ concurrent users on a reference repository of $c_0$ content items.

Performance scalability on users dimension:


Alfresco should support an average response time degradation $\Delta t_u$ below a $\Delta t_{uMax}$ threshold with respect to the baseline upon a $\Delta u$ growth on the users dimension.

Performance scalability on content dimension:

Alfresco should support an average response time degradation $\Delta t_c$ below a $\Delta t_{cMax}$ threshold percentage with respect to the baseline upon a $\Delta c$ growth on the content dimension.
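One possible way to make these definitions concrete is sketched below in Python. It assumes that degradation is expressed as a percentage increase over the baseline average response time; the function names and the figures are hypothetical and are not benchmark results.

```python
def degradation_percent(baseline_ms, measured_ms):
    """Relative response time degradation versus the baseline, in percent."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (measured_ms - baseline_ms) / baseline_ms * 100.0


def meets_requirement(baseline_ms, measured_ms, max_degradation_percent):
    """True when the degradation stays below the agreed threshold."""
    return degradation_percent(baseline_ms, measured_ms) < max_degradation_percent


# Illustrative values only: a 200 ms baseline average response time and
# 250 ms measured after growing the users (or content) dimension.
users_degradation = degradation_percent(200.0, 250.0)
within_threshold = meets_requirement(200.0, 250.0, max_degradation_percent=30.0)
```

The same check applies to either dimension: only the data point that was grown (users or content) changes between runs, while the baseline stays fixed.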

Combined users/content dimensions comparative performance analysis is also possible, but it involves additional caveats and complexity. We discuss this in the following chapters, providing more specific definitions of the exact dimensions, data points, and thresholds used to prove Alfresco scalability.

Performance scalability across the aforementioned dimensions is the main set of results and metrics analyzed in the benchmarks and presented in this document.

Load distribution

Another fundamental aspect to take into account when defining a scalable architecture is how well incoming workload is distributed between the different components of the architecture, in order to provide a balanced framework that can be designed to scale arbitrarily.

While concepts like load balancing and distribution are even more fundamental in cloud deployments [17] [18], they are also a key factor in a successful on-premise Alfresco ECM implementation. There are two fundamental reasons to leverage a modular platform that can distribute the load:

• As explained, load requirements on Alfresco deployments might vary considerably with increased user adoption or with an increase in the number of ECM processes based on it. A platform that is flexible and can reactively redistribute load will be able to cope with these variations with minimal impact.

• Cost optimizations can be identified by spreading the Alfresco load better across tiers and cluster instances, instead of relying on pure vertical scaling alone to meet increased load requirements on your Alfresco solution.

Alfresco has always offered a great deal of flexibility and modularity on architectural deployments, ranging from supporting different stack components to remote independent logical tiers able to spread the overall platform load across multiple physical machines.

Together with performance, distribution of the load upon growth across the different dimensions will be discussed in the next chapters, based on the benchmark results.

Availability

Availability is another factor determining the scalability of an ECM system. Alfresco offers core product answers to typical availability requirements, such as:

17 https://devcentral.f5.com/weblogs/macvittie/archive/2009/01/23/load-balancing-is-key-to-successful-cloud-based-dynamic-architectures.aspx

18 http://gojko.net/2010/01/25/designing-applications-for-cloud-deployment/


High availability: The system should be functional and running for X% of the time (where typically X is close to 100%). This is especially important in global deployments where the system should be up and running 24*7, or in projects characterized by frequent release cycles for which no maintenance windows are provided.

Avoiding single points of failure: This is a common requirement when building resilient architectures, which can support mission critical processes. As explained below, Alfresco components offer horizontal scalability features to respond to this requirement at the application level and integrate with lower layers of horizontal scaling.

Hot backup and disaster recovery (DR) site: This is the ability to design architectures that allow batch data backup operations and, for mission-critical cases that have not been actively designed to avoid single points of failure, keep a passive, remotely synchronized instance, all without any noticeable degradation in application serviceability. Alfresco supports hot backup¹⁹ and advanced DR architectures.
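As a worked illustration of the "X% of the time" requirement, the following minimal Python sketch converts an availability target into the downtime it permits over a period. The function name and the sample targets are assumptions chosen for illustration; they are not figures from this document.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def max_downtime_hours(availability_pct: float,
                       period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime allowed in a period for a given availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (100.0 - availability_pct) / 100.0


# Hypothetical targets, printed as hours of permitted downtime per year.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {max_downtime_hours(target):.2f} h/year")
```

Even a small change in X has a large effect on the downtime budget, which is why maintenance windows become hard to fit as X approaches 100%.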

Alfresco, since version 1.4, has supported application level clustering to provide a strong degree of horizontal scalability to the application, along with the more intuitive concept of vertical scalability (providing more processing power to a maxed out system will definitely raise the throughput). In general, Alfresco responds to availability scalability requirements with the following features:

Application Level Clustering via JGroups/Ehcache and index tracking

Mostly transparent to Application Server clustering on supported application servers

Support for high availability (HA) and failover solutions on underlying components: for example, Oracle RAC (active/active) or MySQL cluster (active/passive) for the database, as well as snapshot-enabled disks for the content store

Since Alfresco 3, Alfresco Share is the default remote, independently scalable Web Tier to support Enterprise wide ECM Platform deployments

With Alfresco 4 and its introduction of a fully externalized indexing tier based on Apache Solr (patched by Alfresco), a new range of architectural options becomes available for deploying Alfresco in HA mode, avoiding single points of failure and transparently scaling the indexing tier in a dedicated manner.

The section “Alfresco 4.x new scalability frontiers” discusses how Alfresco 4 can be leveraged to address availability requirements. However, availability is not the main focus of this scalability document, even though the benchmark architecture has high availability features, such as Alfresco clustering.

A note on stability

Application stability (or availability) is clearly distinguished from the concept of performance and is, therefore, treated separately. A system that is unstable at the level of core resources will probably deliver unsatisfactory performance and a poor user experience. The following issues are typical signs of a system that is overloaded, undersized, or not properly tuned, and likely to present functional leaks or bottlenecks:

System close to Out of Memory

Processes running out of CPU

OS running out of file handles or I/O bandwidth

Frequent crashes due to OS general failures

In these cases, we suggest tackling stability and availability issues before performance fine-tuning, concentrating first on reaching a predictable and stable system that can serve as a model for performance and scalability corrections on the original system.

19 http://docs.alfresco.com/4.0/index.jsp?topic=%2Fcom.alfresco.enterprise.doc%2Ftasks%2Fbackup-hot.html

Out of scope performance scalability factors

At least two more performance dimensions are considered while studying Alfresco scalability in the context of an Enterprise Collaboration platform deployment:

Geographic performance scalability, defined as: The average response time or throughput of the application is independent of, or negligibly impacted by, the geographic location of the user. This is especially important for global deployments in which an Alfresco “logical content tier” can be deployed across multiple physical Alfresco instances (for example, using Content Transfer and Replication services).

Vertical performance scalability, defined as: The average response time or throughput is positively impacted by scaling up resources in one or more components of the infrastructure (typically adding CPU or RAM in one of the tiers). Typically, this type of scalability is better studied in a virtual environment where resources can be added seamlessly to a virtual machine. However, virtual environments have their own specific limitations with regard to performance.

Unfortunately, these types of scalability dimensions are very difficult to reproduce and prove in a single benchmark lab environment. Because of this, these scalability dimensions were left out of the analysis and are out of scope for this document.

Alfresco 3 architectures scalability gotchas

As discussed in the section Availability, Alfresco 3 already offers radical answers to scalability, including clustering/load balancing of the Alfresco repository and Share tiers, as well as more sensible distribution of operations dependencies between database and indexes. This section refers to “Scale your Alfresco Solutions” for a comprehensive discussion on the main HA points.

There are a few areas where Alfresco 3 has limitations regarding the indefinite theoretical scalability of the solutions built on top of it. This is especially evident in systems that are not properly tuned or that are undersized.


To understand how Alfresco 4 overcomes these potential bottlenecks, it is important to provide a high-level view of the Alfresco 3 shortcomings around scalability. As a reference for a typical Alfresco 3 multi-layered large-scale architecture, we will use the architecture presented in Figure 3 Alfresco 3 typical high scale architecture. We will not focus on the details of this architecture, as the focus of this document is on Alfresco 4 architectures.

Figure 3 Alfresco 3 typical high scale architecture

In process (or in transaction) content and metadata indexing

In Alfresco 3 by default, all metadata (and possibly some content) is indexed during the create/update transaction (“In-Transaction Indexing”).

Figure 4 Alfresco 3 - Internal Node Creation process

This model guarantees that the database and search indexes are kept in sync, which in turn allows developers to reliably use database- and search-based APIs side by side. For example, the NodeService API is database-based, while the SearchService API relies on the Lucene indexes. However, this model poses some important questions about system scalability:

Performance:

o Performance might be adversely impacted by growth in the user concurrency dimension, especially when performing write operations.

o The in-transaction indexing model involves I/O pressure and contention on the index which is proportional to the number of concurrent users, because of the synchronous attempts to access the low-level file system indexing segments.

o Search and write operations contend for the same index in a transactional fashion. This can create performance bottlenecks.

Load distribution:

o Apart from the normal content repository operations, Alfresco uses resources and transaction time updating Lucene indexes for every write transaction. This impacts the read and search requests that are being issued against the server.

o Since there is very little separation between the different concerns of content storage and indexing, the load is completely concentrated on the Alfresco repository.

Lucene index tracked per node and Ehcache replicated independently

All Alfresco HA clustering models require the two following processes to be carried out independently by the Alfresco repository tiers (see Figure 5 Alfresco 3 Shared Repository Cluster):

Pull Lucene Index Tracking: Each Alfresco instance keeps its own copy of the Lucene indexes and the Index sync (or tracking) process has to be enabled in every cluster node to keep the indexes up to date with the transactions added by the different nodes of the cluster in the shared database.

Push Ehcache replication: Each Alfresco instance broadcasts L2 cache invalidation and update messages to the other members of the cluster upon completion of a transaction (see Figure 4 Alfresco 3 - Internal Node Creation process), notifying them of the changes so that they do not read stale cache entries. This process runs on a frequent schedule and keeps the cluster node caches in sync.

This approach has the potential for consistency issues between the DB, the caches, and the indexes.


Figure 5 Alfresco 3 Shared Repository Cluster

For the purpose of this document, the following scalability limiting factors can be identified in this approach:

Performance:

o Index maintenance (for example, Index Merging²⁰) processes could have an impact on performance, especially during concurrent activity peaks.

o PATH queries, especially those using //* wildcards, are very slow and resource intensive, limiting the use of hierarchical queries for high concurrent usage.

Load distribution:

o Index tracking and maintenance work has to be carried out by each and every Alfresco front-end node. Each node also has to cope with front-end user interactions and there is no way of scaling out this set of operations (scaling up is the most common solution).

o The index related load (tracking and maintenance) is proportional to the number of nodes in the cluster, with increased access to the DB and content stores.

ACL post query permission checking

Search is extensively used when Alfresco is deployed as a collaboration platform. In large global deployments, corporate I&AM systems are integrated with Alfresco, synchronizing large numbers of users and groups, mapping them in an arbitrarily complex matrix of hierarchical permissions on Alfresco spaces and sites. In these contexts, one of the most stressed Alfresco components is the one delegated to filter search results according to the current user’s permissions: that is, evaluation of hierarchical ACLs against search results.

20 http://wiki.alfresco.com/wiki/Index_Merging_Performance

As shown in Figure 6 Alfresco 3 - Internal query process, permission checking is carried out after Lucene has generated the result set for the search.

Figure 6 Alfresco 3 - Internal query process

This model involves an additional iteration across all rows of the result set after selection. Because this process is resource and time intensive, it is bounded by the following parameters:

system.acl.maxPermissionChecks

system.acl.maxPermissionCheckTimeMillis

These two parameters respectively limit the number of results that undergo permission checking and limit the total time that is spent on the permission-checking phase. Therefore, it is possible that one search will not return all the results visible for the user if permission-checking timeouts are hit. Also, permission-checking results are cached, so successive execution of the same search might return an increased number of results that could be permission checked. This type of effect is typically seen when the repository content nodes are in the order of millions.
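The effect of these two limits can be sketched in a few lines of Python. This is a simplified model written for illustration, not Alfresco's implementation: the function `filter_by_permission`, its arguments, and the sample data are all hypothetical, and the caching of permission-checking results is not modeled.

```python
import time
from typing import Callable, Iterable, List


def filter_by_permission(
    hits: Iterable[str],
    can_read: Callable[[str], bool],
    max_checks: int,
    max_time_ms: int,
) -> List[str]:
    """Post-filter search hits, stopping when either limit is reached."""
    deadline = time.monotonic() + max_time_ms / 1000.0
    visible: List[str] = []
    for checked, node in enumerate(hits):
        if checked >= max_checks or time.monotonic() > deadline:
            break  # remaining hits are dropped without being checked
        if can_read(node):
            visible.append(node)
    return visible


# Hypothetical result set: 10 hits, the user may read the even-numbered ones.
hits = [f"node-{i}" for i in range(10)]
readable = lambda node: int(node.split("-")[1]) % 2 == 0

# A low check limit truncates the result set; a generous one does not.
print(filter_by_permission(hits, readable, max_checks=4, max_time_ms=10_000))
print(filter_by_permission(hits, readable, max_checks=1000, max_time_ms=10_000))
```

With the low limit, readable hits beyond the fourth row are never evaluated, which is the incomplete-results effect described above.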

Putting aside for a moment the pure consistency implications, from a scalability standpoint, we can identify potential bottlenecks in:

Performance

o Raising the system.acl.* properties might overcome the consistency implications by allowing more time to check permissions, so that the same search returns a consistent number of results. On the other hand, performance is traded for consistency: the performance cost increases as the repository grows in size and the ACLs grow in complexity.

Load distribution

o The search ACL checking load is an additional responsibility consolidated in the repository, inherently limiting the chances for scaling out such a complex operation.


Non-HTTP based Virtual File systems not available in HA mode

While all the Alfresco components can be deployed in HA mode, with respect to the several UIs provided to end users, in Alfresco 3 there was a limitation on scaling virtual file systems horizontally. Virtual file systems are a powerful Alfresco feature that allows you to mount the Alfresco repository as a WebDAV, FTP, CIFS, or NFS shared folder.

Whereas WebDAV (HTTP based and relying on standard Alfresco servlets, caching, and clustering rules) has always been clusterable, the FTP, CIFS, and NFS virtual file system interfaces could not be clustered in Alfresco 3. This was mainly due to the presence of additional caching layers and limitations in handling intra-cluster locking operations.

There are obviously implications on the degree of scalability that could be reached:

Performance

o Performance of virtual file systems could be adversely impacted by high concurrent load.

Load distribution

o In the case of high usage of the virtual file systems, it is easy to overload a clustered production installation. All the HTTP traffic can be load balanced while all the virtual file server traffic must be directed to a single node.

o HA architectures using this approach are by definition unbalanced clusters, as all virtual file server traffic is redirected to only one member of the cluster.

Availability

o Virtual file servers become a single point of failure.

Alfresco 4.x new scalability frontiers

Alfresco 4 introduced a few key design and architecture innovations that were targeted at simplifying the deployment of large-scale solutions. Since these new deployment options are key to understanding the architecture of the benchmark, they are highlighted in this section.

In particular, we will see how Alfresco 4 addresses the scalability limitations presented in the section Alfresco 3 architectures scalability gotchas and offers new frontiers towards even larger scale deployments.

Apache Solr indexing tier

From an architecture and scalability standpoint, the main innovation in Alfresco 4 is the definition of an external, independent indexing tier implemented by integrating the mature enterprise search engine Apache Solr²¹.

Index subsystem design

Solr is itself based on a more recent version of Lucene, but exposes a remotely addressable REST API to allow remote querying, update, and administration of the underlying Lucene indexes. Solr itself supports large-scale, search-intensive enterprise deployments, and this brings immediate scalability benefits to the Alfresco indexing components, whose responsibilities can now be fully delegated to an independent (possibly remote) subsystem like Solr. Accordingly, we can identify the design/architecture innovation in Alfresco 4 as being:

The full decoupling of Alfresco core and UI functionality from an embedded indexing system, encapsulating indexing features in a fully independent, and independently scalable, indexing tier based on Solr

21 http://lucene.apache.org/solr/

In Alfresco terms: indexing and search capabilities have been refactored into a cohesive software module, the “Index subsystem”²², which is now clearly focused on providing search functions to Alfresco users and client applications. Alfresco 4 does not depend on an indexing and search engine for the correct behavior of most of its services. Alfresco now relies on the DBMS to ensure the essential functionality of the different UIs.

In addition, the allowed values for the property index.subsystem.name are:

lucene: Alfresco will continue to use an instance of Lucene embedded in Alfresco, so all the considerations around Alfresco 3 apply.

solr: Enables the Solr integration.

noindex: Configure Alfresco not to use any indexing system. In this case, obviously, search requests from Alfresco Explorer, Share, or HTTP APIs will not return any results.

The introduction of an independent subsystem in the scalability context offers immediate advantages like:

Distribution of core repository and indexing loads on separate tiers: This can potentially enable very distributed deployments and scalable architectures, as we will see, by using load balancing and caching techniques.

Independent vertical and horizontal scaling of the repository and the index tier

Separation of the front-end query load from the indexing load: This will become clearer in the next section, which discusses the concept of eventual indexing (eventual consistency).

This new approach to indexing involves a number of deep changes in the way Alfresco manages the lifecycle of the indexes and interacts with the index subsystem. This refactoring and its effects are key to understanding how the load is distributed between tiers in Alfresco 4 and, therefore, how that impacts the independent scaling of the different tiers. We provide an introduction to these key scalability facts in the following sections.

Transactional consistency vs. eventual consistency

Maintaining a transactional indexing model (like the Alfresco 3 model described in the section In process (or in transaction) content and metadata indexing) in conjunction with an external indexing tier (involving distributed transactions and network latency) would have posed strong risks to the stability of the platform. Therefore, in order to overcome the scalability limitations of in-transaction indexing, Alfresco 4 introduces the concept of eventual consistency.

22 http://wiki.alfresco.com/wiki/Alfresco_And_SOLR#Configuring_Alfresco


Figure 7 Alfresco 4 - Node creation internal process

In Figure 7 Alfresco 4 - Node creation internal process, we show how the node creation process has been refactored in Alfresco 4 so that it no longer includes any in-transaction indexing operation. Therefore, the Alfresco repository can focus on performing core content management operations, delegating the indexing duties to a scheduled pull index tracking job running in the Solr tier²³.

In other words, Alfresco 4 removes the requirement to have the database and indexes in perfect sync at any given time and relies on an index that is updated at a configurable interval (default: 15s) by Solr itself. The index tracker polls Alfresco for new transactions and updates its index accordingly, much as an Alfresco 3 cluster node tracks updates in its own copy of the Lucene indexes. In this sense, indexes will eventually be consistent with the database. Alfresco implementers using the Search API²⁴ should be aware of the following:

Implementers should not assume that the indexes are consistent with the database at any given time, since the index update delay might depend on the actual load on the repository/index tiers.

Note that Alfresco 4 does not rely on the index for its core functionality. Wherever transactional consistency was required (authentication, Document Library, bootstrap, check-in/out, etc.), the use of the Search API in Alfresco 3 has been refactored to use database queries.
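For custom code that writes content and then needs to find it through search, one common way to live with eventual consistency is to poll with a timeout rather than assume immediate visibility. The following is a minimal, generic Python sketch; `wait_until` is a hypothetical helper, not an Alfresco API, and the `indexed` set merely stands in for the state of the search index.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float,
               interval_s: float = 0.5) -> bool:
    """Poll `condition` until it holds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for the index: empty until the tracker has caught up.
indexed = set()

# The document was written but is not searchable yet, so the wait times out.
print(wait_until(lambda: "doc-1" in indexed, timeout_s=0.05, interval_s=0.01))

indexed.add("doc-1")  # simulates the index tracker processing the transaction
print(wait_until(lambda: "doc-1" in indexed, timeout_s=0.05, interval_s=0.01))
```

Where transactional consistency is genuinely required, a database-backed API remains the safer choice, as noted above.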

In-query ACL checks

Another key Alfresco 4 scalability redesign is to delegate the search query results permission checking to Solr, where the index subsystem is configured to use it (note that if the Lucene implementation is in use, Alfresco 3 limitations around ACL performance remain). Leveraging the built-in Solr concept of filter queries²⁵, Alfresco also provides user and group membership information over the wire so that Solr, which holds the document permission information, can return only results that are actually visible to the currently logged-in user.

23 http://wiki.alfresco.com/wiki/Alfresco_And_SOLR#Tracking

24 https://wiki.alfresco.com/wiki/Search#The_Search_API

25 http://wiki.apache.org/solr/CommonQueryParameters#fq


Figure 8 Alfresco - Search query internal process provides a graphical representation of the refactored flow of operations:

Figure 8 Alfresco - Search query internal process

From a load distribution point of view, ACL checks are removed from the Alfresco repository tier and distributed to the index with important positive effects, as we will see, on the overall Alfresco user experience. In particular, search results will not be trimmed by any ACL-checking time constraints.

Solr needs to have knowledge of authorization information that is created and modified in Alfresco; therefore, the aforementioned index tracker will have to gather this information together with the content and model updates.
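The idea of in-query ACL checking can be illustrated with a small, self-contained Python model. It is only a conceptual sketch: the `IndexedDoc` structure, the `search` function, and the authority names are hypothetical and do not reflect how Alfresco or Solr represent permissions internally. The point is that reader authorities are stored alongside each indexed document, so matching and permission filtering happen in a single pass on the index side.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    text: str
    readers: FrozenSet[str]  # authorities allowed to read, stored in the index


def search(index: List[IndexedDoc], term: str, authorities: Set[str]) -> List[str]:
    """Match on text and on reader authorities in a single pass."""
    return [
        doc.doc_id
        for doc in index
        if term in doc.text and doc.readers & authorities
    ]


# Hypothetical index content and user memberships.
index = [
    IndexedDoc("doc-1", "sales report", frozenset({"GROUP_sales"})),
    IndexedDoc("doc-2", "hr report", frozenset({"GROUP_hr"})),
    IndexedDoc("doc-3", "public report", frozenset({"GROUP_EVERYONE"})),
]
user_authorities = {"alice", "GROUP_sales", "GROUP_EVERYONE"}

print(search(index, "report", user_authorities))
```

Because no post-processing loop runs in the repository, there is no check-count or time limit that could trim the result set.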

Alfresco and Solr interaction overview

In the previous sections we have described the new repository-index interaction pattern of Alfresco 4. This section describes how it is implemented in practice, in terms of the architectural interaction between the content and index (Solr) tiers, which need to exchange content-related information in a number of use cases.

Figure 9 Alfresco <-> Solr interaction diagram provides an overview of the remote integration between the two components. This diagram covers two main use cases:

Alfresco repository using Solr as an implementation of the Search API: that is, routing its search requests to Solr and retrieving results

1. Alfresco issues a REST request to Solr, specifying the query and user/group membership information via request parameters and a JSON request body.

2. Solr retrieves the hits, checks their ACL, and returns them in JSON format.


Figure 9 Alfresco <-> Solr interaction diagram

Solr periodically polls Alfresco and tracks index updates

1. Solr issues a scheduled REST request to Alfresco every 15s (by default) to get index updates, specifying the last indexed transaction.

2. Alfresco returns index updates in JSON format concerning:

a. Node Properties and Content updated since the last index update

b. Content Model changes since the last index update, in order for Solr to start indexing according to new or changed types, properties, aspects, and so on

c. Content ACLs modified since the last index update so that Solr can keep them indexed and use them to filter authenticated searches from Alfresco
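The pull-based tracking described in this use case can be modeled with a short in-memory Python simulation. It is a sketch under simplifying assumptions: the `Repository` and `IndexTracker` classes are hypothetical stand-ins, the REST/JSON exchange is replaced by a direct method call, and only node content is tracked (model and ACL changes are omitted).

```python
from typing import Dict, List, Tuple


class Repository:
    """Stand-in for the content tier: an append-only transaction log."""

    def __init__(self) -> None:
        self._log: List[Tuple[int, str, str]] = []  # (txn_id, node_id, content)

    def commit(self, node_id: str, content: str) -> int:
        txn_id = len(self._log) + 1
        self._log.append((txn_id, node_id, content))
        return txn_id

    def changes_since(self, last_txn_id: int) -> List[Tuple[int, str, str]]:
        return [entry for entry in self._log if entry[0] > last_txn_id]


class IndexTracker:
    """Stand-in for the index tier: pulls changes on each scheduled run."""

    def __init__(self, repo: Repository) -> None:
        self.repo = repo
        self.last_txn_id = 0
        self.index: Dict[str, str] = {}

    def poll(self) -> int:
        """Fetch and index everything after the last indexed transaction."""
        changes = self.repo.changes_since(self.last_txn_id)
        for txn_id, node_id, content in changes:
            self.index[node_id] = content
            self.last_txn_id = txn_id
        return len(changes)


repo = Repository()
tracker = IndexTracker(repo)
repo.commit("node-1", "quarterly report")
print("node-1" in tracker.index)  # committed, but not yet indexed
tracker.poll()                    # in a deployment this would run on a schedule
print("node-1" in tracker.index)
```

Keeping the last indexed transaction on the tracker side means the repository does not need to know how many index nodes exist or how far each one has progressed.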

As you might notice from the picture, the mapping of Alfresco stores (for example, Workspace, Archive) is very straightforward: each store is mapped onto a separate core in Solr and can be configured and tuned separately²⁶.

Thanks to the design changes described in the previous sections, and in particular the consolidation of the indexing tier on Solr, the scalability of the platform is greatly increased because it is possible to scale out and load balance the indexing tier independently from the content tier. By introducing a load balancer layer between the content and indexing tiers, it is possible to balance requests from Alfresco across multiple physical instances of Solr, and vice versa.

The architecture presented in Figure 10 Alfresco 4 Solr Load Balanced architecture shows that this could be useful in two use cases:

1. Alfresco to Solr search load balancing: This is the most obvious use case for scalability purposes. Search requests are directed through Alfresco to a pool of Solr instances, each of which contains a full copy of the index and is able to service requests in a purely stateless fashion.

2. Solr to Alfresco index tracking balancing: In the other direction, Solr nodes could use a load balancer to redirect their index tracking requests to one or multiple dedicated/shared Alfresco nodes. This could be useful in case of large indexing load, due to a heavy concurrent write/update scenario.

26 http://wiki.alfresco.com/wiki/Alfresco_And_SOLR

Figure 10 Alfresco 4 Solr Load Balanced architecture

NOTE: An evolution of the scenario in use case 2 was actually used in the benchmarks, for which each Solr node had a dedicated Alfresco instance installed in the same application server (in process) to speed up and evenly distribute the index tracking load. This has proven to be a very beneficial deployment model, as we will discuss in the next sections.
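Because each Solr instance holds a full copy of the index and serves requests statelessly, even the simplest balancing policy works for the search use case. The Python sketch below shows a round-robin selector; the `RoundRobinBalancer` class and the host names are hypothetical, and a production deployment would normally rely on a dedicated load balancer rather than application code.

```python
from itertools import cycle
from typing import Iterator, List


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends: Iterator[str] = cycle(backends)

    def next_backend(self) -> str:
        return next(self._backends)


# Hypothetical pool of Solr instances behind the content tier.
solr_pool = RoundRobinBalancer(["solr-1:8080", "solr-2:8080", "solr-3:8080"])
for _ in range(4):
    print(solr_pool.next_backend())
```

The same selector could be pointed at a pool of Alfresco nodes to model the index tracking direction.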

The scalability of the indexing tier is at the base of most considerations around benchmark results in the following chapter. However, for completeness, the following sections provide a brief overview of the other scalability features of Alfresco 4. Despite not being strictly covered in the benchmarks and, therefore, not the primary objective of this document, it is important to understand these features to know the pillars on which to build scalable and reliable Alfresco architectures.

Clustered file system interfaces

As discussed in the section Non-HTTP based Virtual File systems not available in HA mode, Alfresco 3 did not allow clustering of file system interfaces like CIFS and FTP.

Alfresco 4 solved this shortcoming with a strong refactoring of the file systems-related code and the introduction of a new caching mechanism (Hazelcast²⁷). Running CIFS and FTP in clustered mode, balancing requests across multiple Alfresco nodes, is now supported²⁸.

27 http://www.hazelcast.com/

This allows the deployment of Alfresco solutions that scale transparently regardless of which user interface is in use. It also allows performance requirements to be defined in a more general, UI-independent way, based on the type of operations performed on the platform.

Alfresco Transformation Server

The Alfresco Transformation Server²⁹ is a component available as of Alfresco 3.4. It was integrated with Alfresco to provide increased fidelity, reliability, and potential scalability of the transformation tier. This Windows-based (Microsoft Office) application accepts requests for content transformations from the Alfresco content tier. If your scenario is strongly oriented towards document transformations (especially Microsoft Office documents) you should consider this deployment option. Please contact Alfresco Support or your local Solution Engineer for further information around deployment and pricing of this component.

A note on the Alfresco Cloud service

In the last section, we discussed how Alfresco 4 Enterprise version delivers a revolutionary approach to ECM scalability for on-premise enterprise deployments. On a parallel track, Alfresco has recently launched a cloud-hosted service at http://my.alfresco.com.

The Alfresco Cloud service vision is to support consumer-oriented and Enterprise collaboration internally and across the firewall in a controlled and harmonized fashion by providing extreme scalability and a hybrid cloud model (with synchronization between private and public cloud, among other features). In other words, the Alfresco Cloud supports and services the quickly growing complex interconnected network of private, public, mobile, and social content that we identify as cloud-connected content. Figure 11 Cloud Connected Content roadmap presents a target roadmap for the cloud service and Figure 12 Alfresco Cloud Connected Content as a service shows the wide potential of cloud-connected content applications that could be built against the Alfresco cloud service in a pure content as a service style.

Figure 11 Cloud Connected Content roadmap

28 As of version ??? . Please check with Alfresco Support for exact support boundaries
29 http://docs.alfresco.com/4.0/topic/com.alfresco.enterprise.doc/concepts/transerv-intro.html


Although this is an exciting new horizon in the deployment of ubiquitous, super-scale Alfresco solutions on both the virtually infinite cloud resources and space-based architectures [30], the cloud initiative is also, to a certain extent, driving the Alfresco roadmap with interesting features that will be ported back into Alfresco Enterprise, especially to support large on-premise deployments.

Figure 12 Alfresco Cloud Connected Content as a service

There are two main product evolutions currently under development for the cloud services that are of specific interest to scalability. These could make it into Alfresco Enterprise in later releases: Transparent Multi-Tenancy and Index Sharding.

Transparent multi-tenancy

Alfresco multi-tenancy has been strongly revamped and improved for the Cloud service in order to allow, among others, the following features:

Cross-user authentication between different tenants (or communities, as they are called in the cloud service)

Import/export/location independence of a specific tenant

These improvements would open new frontiers in terms of repository partitioning and geographical distribution, as well as for private cloud deployment scalability.

Index sharding

Currently, in Alfresco 4, each Solr instance deployed in load-balancing mode has to keep a full copy of the indexes. Scaling out the Solr tier improves load balancing and, therefore, overall system stability. However, it does not necessarily improve the performance of a single search, which is still executed against the full, potentially large index held on each Solr node.

One solution to this issue would be to separate the index into separate Solr shards: different subsets of the index deployed on separate Solr cores (potentially in different physical machines), which could then execute a user search in parallel with a clear performance boost due to parallelization.
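The parallel execution described above can be sketched as a scatter-gather query. This is an illustration only and does not reflect Alfresco or Solr code: the Shard class, its in-memory docs mapping, and the term-count scoring are hypothetical stand-ins for real Solr cores and real relevance ranking.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import nlargest


class Shard:
    """Hypothetical stand-in for one Solr core holding a subset of the index."""

    def __init__(self, docs):
        self.docs = docs  # mapping: document id -> text

    def search(self, term, limit):
        # Toy relevance score: number of occurrences of the term in the text.
        hits = [(text.lower().split().count(term), doc_id)
                for doc_id, text in self.docs.items()]
        return nlargest(limit, [hit for hit in hits if hit[0] > 0])


def sharded_search(shards, term, limit=10):
    """Query every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: shard.search(term, limit), shards))
    merged = [hit for partial in partials for hit in partial]
    return [doc_id for _score, doc_id in nlargest(limit, merged)]


shards = [
    Shard({"a": "budget report budget", "b": "meeting notes"}),
    Shard({"c": "budget summary", "d": "budget budget budget plan"}),
]
print(sharded_search(shards, "budget", limit=3))
```

Each shard only scans its own subset, so the per-shard work shrinks as shards are added; the cost that remains is the final merge of the partial result lists.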

30 http://en.wikipedia.org/wiki/Space_based_architecture


Even though Apache Solr supports a distributed search model [31], Solr index sharding is currently NOT supported in Alfresco Enterprise.

The requirements for highly scalable, high-performance searches are clear, and Alfresco is working in this area to fully support an index sharding solution.

31 http://wiki.apache.org/solr/DistributedSearch


The Alfresco 4 Benchmarks

This chapter introduces the benchmark projects underway and completed by Alfresco on the Alfresco 4 release. Among the different benchmark practices, we will mostly focus on scalability benchmarks and Solr versus Lucene comparative benchmarks, providing details about the lab architecture used for running the benchmarks, the technologies used, and the test scenario implemented. Finally, we will present the raw results for the most relevant benchmarks as input for the scalability analysis presented in the next chapter.

Benchmark types

A full definition of the different potential benchmark types and practices is out of the scope of this document. To explain which benchmarks have been run and their characteristics and limitations, we provide a brief ECM and Alfresco oriented classification of the potential benchmarks. For every benchmark type, we indicate whether it has been run, is planned, or is out of the scope of the Alfresco benchmark project.

For the purpose of this document, there are three benchmark types:

1. Scalability Benchmarks
2. Comparative Benchmarks
3. Tuning/Optimization benchmarks (described below as system boundary discovery benchmarks)

Scalability Benchmarks

The main objectives of a scalability benchmark are:

Study the evolution of system performance and resource usage by testing against different points in the scalability space, defined by typically growing values across one or more scalability dimensions

Prove the stability and provide an idea of response times when using Alfresco on a very high-end large scale deployment scenario, especially if scalability points are sparse (that is, growth on selected dimensions is massive)

The game rules for a scalability benchmark typically involve:

A fixed benchmark scenario to be run against the different data points

A fixed configuration/tuning of Alfresco and the underlying components to compare performance on like-for-like conditions between different test runs

A discrete number of scalability points to run the test against and record measurements and metrics around performance and resource usage

The following limitations apply to this type of benchmark:

System configuration and tuning is defined a priori and on a best effort basis, and it is not changed per data point. This means that results for each scalability point might be sub-optimal. This is not an issue in the scope of these benchmarks, as the main target is to study performance on a statistical basis between the different data points.

Scalability benchmarks were the initial focus of this Alfresco benchmark project and are, therefore, the main subject of this document.


See the section The Alfresco 4 Scalability benchmark for a detailed description of the scalability benchmark we ran; see the next chapter for a full analysis of the results of this benchmark.

Comparative Benchmarks

Although there are a number of potential combinations in a comparative benchmark, this type of benchmark typically has the common objective of comparing platform performance in different conditions, normally using different software, infrastructure components, architecture, or deployment configuration.

The game rules that apply for a comparative benchmark involve:

A fixed benchmark scenario to be run against the different configurations

A variable configuration/tuning of Alfresco and the underlying components to compare performance in different conditions between different test runs

One or a limited number of meaningful scalability points to run the test against, as well as to record measurements and metrics around performance and resource usage in the different configurations to compare

Comparative benchmarks were not the main focus of this Alfresco benchmark project. In order to reinforce the considerations from the previous chapter around index tier scalability, we ran and will present relevant results of an index tier comparative benchmark.

Later in this chapter we present the results of this benchmark, which involved running a well-defined scenario (the same used for scalability benchmarks) against Alfresco 4 using the embedded Lucene and the Solr subsystem respectively.

System boundary discovery benchmarks

The main objectives of a system boundary discovery benchmark are:

Discover the physical scalability limits of a fully specified, optimized Alfresco deployment running a well-defined scenario at very high scalability points

Identify the optimal Alfresco and underlying component tuning for the tested data points

Extrapolate exact sizing information about the maximum load supportable by a given Alfresco deployment

The game rules that apply for a system boundary discovery benchmark involve:

A fixed benchmark scenario to be run multiple times against different platform tunings

A variable tuning (often not architecture) of Alfresco and the underlying components to compare with previous test runs and offer better performance/stability against growing data points

Incrementally higher scalability points to run the tests against to increasingly stress the system and test different tuning configurations, up to the point that the system’s physical boundaries are reached and no further tuning is measured to improve performance or stability of the system


System boundary discovery benchmarks were not the main focus of this Alfresco benchmark project, so there is no exact detail for optimal tuning at each and every scalability point.

At the time of this writing, a benchmark effort is ongoing to discover the system boundaries of Alfresco on a well-defined scenario, similar to the one presented in the following sections. Results of this new benchmark effort might be added to newer versions of this document or published separately. Please check with Alfresco Support for further information.

Alfresco 4 Benchmark details

In the following sections we will provide the results for two types of benchmarks: the Alfresco 4 Scalability benchmarks and the Alfresco 4 Lucene vs. Solr Comparative benchmark.

First we will explain the common details between the two benchmarks. They share most of the definition aspects, such as testing scenario, scalability dimensions in which scalability/data points were defined, the lab hardware on which they were run, and some common architectural characteristics.

Scenario

The scenario that was tested is the Enterprise collaboration scenario, based on the Share user interface. It was implemented using Apache JMeter [32], and the script is available in the Alfresco public SVN [33].

Running the scenario involves defining a configurable number of JMeter threads (virtual users), each of which simulates either a read-only or a read/write user session. The percentage of read-only vs. read/write cycles that are executed is also configurable. For full installation and configuration steps of the suite, refer to the suite README [34].

The main rationale behind this scenario is in line with the common objective of the benchmark: to demonstrate Alfresco 4 scalability limits, especially in the areas where earlier versions would present scalability boundaries, such as the index tier. In this sense, the scenario we are presenting is very search intensive and is mostly focused on document-oriented collaboration (that is, on the Document Library and Search functionalities of Alfresco Share).

In Table 1 Benchmark scenario user operations we provide an accurate description of the steps performed by the JMeter virtual users in the read-only and read/write cycles. Table 2 Benchmark search operations breakdown expands the Search test step into the different types of search steps that are performed.

READ ONLY | READ/WRITE
Login | Login
User dashboard | User dashboard
Site dashboard | Site dashboard
Site document library | Site document library
Search | Upload File
View Document Details | View Document Details
Logout | Logout

Table 1 Benchmark scenario user operations

32 http://jmeter.apache.org/
33 http://svn.alfresco.com/repos/alfresco-open-mirror/benchmark/scripts/SHARE/share-0001/
34 http://svn.alfresco.com/repos/alfresco-open-mirror/benchmark/scripts/SHARE/share-0001/V4.0.0/readme.doc

Based on these tables, we will formalize the exact number and type of operations performed by the virtual users following these assumptions:

Transactions are defined as high-level end-user operations and might involve multiple low-level HTTP or database requests (for example, dashlet or document library asynchronous requests)

Login and logout are out of the calculation scope

The Read Only flow has 4 read transactions and 6 search transactions, while the Read/Write flow has 4 read transactions and 1 write transaction.

With reference to the flows described previously, the Alfresco benchmarks used an 80% / 20% split between Read Only and Read/Write flows. This means that out of X number of users, 80% of X will execute Read Only flows and 20% will execute Read/Write flows.

The Enterprise collaboration scenario used in the benchmarks implements the following user transactions profile:

READ | SEARCH | WRITE
48% | 48% | 4%

Table 3 Benchmark scenario read/search/write split

SEARCH
Metadata search on current site
Full text search on current site
Metadata search on “All Sites”
Full text search on “All Sites”
Metadata search on “Full Repository”
Full text search on “Full Repository”

Table 2 Benchmark search operations breakdown

To calculate the actual load on the Alfresco platform, some additional details are fundamental to characterize the scenario and are, therefore, defined as mandatory input values for the JMeter test suite. The following table shows these details.

PARAMETER | DESCRIPTION | VALUE
think_time | Inter-transaction user wait time in milliseconds. Defines frequency of operations. | 10000 (10s)
write_percentage | Defines the split between Read Only and Read/Write flows | 20
users_count | Number of threads/virtual users to simulate concurrent usage. This parameter was used to change the number of concurrent users hitting the platform. | Depending on test run
full_loops_count | Full number of loops to be performed by a thread/virtual user. Threads will perform write_percentage Read/Write loops and (100% - write_percentage) Read Only loops. | 1000
test_full_time | Test full execution time in seconds, including ramp up. If full_loops_count is not reached, threads are stopped gracefully (they wait for the current loop to finish). | 7200 (2h)
ramp_up_time | Threads/virtual users ramp up time | 600 (10m)
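As a rough worked example of how these input values translate into load, the sketch below derives a few figures for a given users_count. It is not part of the JMeter suite: the function name and the returned keys are illustrative, the split is applied per user as described for the 80% / 20% flows, ramp_up_time is assumed to be in seconds, and response times are ignored, so the transaction rate is an upper bound.

```python
def estimate_load(users_count, write_percentage=20, think_time_ms=10000,
                  ramp_up_time_s=600):
    """Rough load figures derived from the test suite input parameters."""
    read_write_users = users_count * write_percentage // 100
    read_only_users = users_count - read_write_users
    # A virtual user waits think_time between transactions, so it can issue
    # at most one transaction per think_time (response time is ignored).
    max_tx_per_second = users_count / (think_time_ms / 1000)
    # Threads are assumed to start evenly over the ramp-up period.
    users_started_per_minute = users_count / (ramp_up_time_s / 60)
    return {
        "read_only_users": read_only_users,
        "read_write_users": read_write_users,
        "max_tx_per_second": max_tx_per_second,
        "users_started_per_minute": users_started_per_minute,
    }


load = estimate_load(1080)
for name, value in load.items():
    print(f"{name}: {value}")
```

With 1080 virtual users this gives 864 Read Only users, 216 Read/Write users, and at most 108 high-level transactions per second once all threads are running.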

Share asynchronous requests

With the scenario details provided, you should by now have an understanding of which tests were run and which Alfresco Share functionalities were tested.

However, one of the assumptions requires some additional information in order for you to understand how effective these tests are and the load they generate on Alfresco. We defined user transactions at a high-level, ignoring in the calculations the multiple/nested asynchronous HTTP requests typically executed via AJAX by the Share UI upon user clicks.

While we define a coarse-grained analysis entity for the sake of simplifying interpretation of the results, we obviously had to replicate and execute such nested requests (or at least the main ones) to actually execute operations on the repository and simulate a realistic load on the Alfresco and Share tiers.

While a full 1:1 mapping of all the Share AJAX requests goes beyond the scope of JMeter and load testing (functional HTTP testing frameworks work best in those scenarios), we used a JMeter proxy [35] to record the heaviest and most meaningful nested AJAX requests impacting performance and scalability of the platform, and we embedded those in the main user interaction test requests. In terms of measurements, in line with the stated assumptions, we will generate only the main sample (time for main request + nested requests) rather than measuring the time for each single nested request.
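The measurement rule above (one sample covering the main request plus its nested requests) can be illustrated with a minimal sketch. It is not taken from the JMeter script: timed_sample and fake_request are hypothetical names, and the requests are simulated with short sleeps instead of real HTTP calls.

```python
import time


def timed_sample(main_request, nested_requests):
    """Run a main request plus its nested requests and time them as one sample."""
    start = time.perf_counter()
    main_request()
    for request in nested_requests:
        request()
    return time.perf_counter() - start


def fake_request(seconds):
    """Hypothetical stand-in for an HTTP call: it only sleeps for a fixed time."""
    return lambda: time.sleep(seconds)


# One main request followed by two nested dashlet-style requests.
elapsed = timed_sample(fake_request(0.02), [fake_request(0.01), fake_request(0.01)])
print(f"site dashboard sample: {elapsed:.3f}s")
```

Only the aggregate time is reported, which keeps the results at the level of the high-level user transactions defined earlier.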

In the following tables, we detail the nested requests in each test step to give an idea of the actual Share load simulation performed in the benchmark scenario.

USER DASHBOARD
User Dashboard main request
Dashlet “Recently Modified by Me”

SITE DASHBOARD
Site Dashboard main request
Dashlet “Recently Modified Docs”
Dashlet “Site Activities”

SITE DOCUMENT LIBRARY
Site Document Library main request
Site Tags (left sidebar)
Site Document Library data
Site Tree node (left sidebar)

SEARCH
Search main request
Load results YUI request

Despite not being a full simulation of each and every Share request, these tables show that we have reconstructed a very large portion of the Share document-oriented collaboration pattern, which serves the scalability and statistical purposes of this document well.

Scenario limitations

When applying the results of this benchmark to real use cases, please keep in mind the following limitations:

Not all interfaces are being used: We only tested Share, so applying scalability considerations to other UIs should be done with care, or avoided altogether.

Document-oriented collaboration only: In Share we focused on document-oriented collaboration; therefore, scalability considerations on other Share features (for example, forums, wiki, and datalists) should be deferred to a future benchmark or private dedicated load test.

Full AJAX interaction not reproduced: Only the main AJAX Share requests were reproduced, so if scalability issues are experienced with other nested Alfresco functionalities, further investigation might be required.

35 http://jmeter.apache.org/usermanual/jmeter_proxy_step_by_step.pdf

Scalability dimensions

Also common to all Alfresco-run benchmarks was the selection of three scalability dimensions. These dimensions define the space in which we identify the scalability points used to run the Enterprise collaboration scenario:

Number of concurrent users

Content items in the repository

Number of Alfresco repository cluster nodes

As shown in Figure 13 Benchmark scalability dimensions, we identified scalability points (or data points in the standard testing jargon) in a three dimensional space defined by the three dimensions:

Figure 13 Benchmark scalability dimensions

Concurrent users

This is the most intuitive dimension to test any platform against: it involves running the defined scenario with a different, typically increasing, number of virtual threads simulating increasing user concurrency on the system.

As we will see in the specific benchmarks, we scaled this dimension up to 1100 concurrent users, creating 1100 users in the repository and 1100 associated Share sites, and then running the scenario with 1100 concurrent virtual threads. We will show how Alfresco performed in this scenario, proving scalability across the concurrent users dimension.

Content in repository

For an ECM platform, and especially looking at the huge scale implementations that are nowadays based on Alfresco, scalability across the content dimension is fundamental. Therefore, we loaded the Alfresco repository at different stages (up to 10 million documents) and ran the scenario against the loaded repository.


In order to define a realistic scenario, which would reproduce a potentially real use case while still stressing the underlying Alfresco components, we implemented the following repository loading strategies:

Content was bulk loaded using the Bulk File System Import tool (available in Google Code [36] and integrated with Alfresco 4.0 [37]) prior to test runs and with no specific bulk loading tuning

Content was loaded in Share private sites to stress the ACL checking process, especially the Solr in-query results ACL checking feature

You can find some indicative results of the bulk loading process (up to 10 million documents) in the section of this document that covers bulk loading.

Additional details about the content loaded can help explain the exact scenario that was reproduced in the benchmarks:

Bulk loaded content had an average size of 250 KB, a common office file size when working in collaboration environments

Loaded content had mixed MIME types, especially typical office document MIME types, such as Word (.docx), Excel (.xls), PowerPoint (.pptx), PDF, JPG, and plain text. This was implemented to simulate the typical load on the Alfresco transformation and indexing components in Enterprise collaboration environments.

One private site per user was created by the test setup procedures to:

a. Provide an area with appropriate permissions for virtual users to upload documents

b. Fragment the ACL matrix when performing local/global searches

c. Simulate an Enterprise collaboration environment with several Share sites
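As a rough worked example of what these loading figures imply for storage, the sketch below multiplies the stated average document size by the repository sizes used in the benchmark (500K, 2M, and 10M documents). It assumes that the average size is expressed in kilobytes and uses decimal units; the function name is illustrative, and versions, renditions, and index size are ignored.

```python
AVERAGE_SIZE_KB = 250  # average document size stated in the text


def content_store_size_gb(documents, average_size_kb=AVERAGE_SIZE_KB):
    """Raw size of the loaded content in decimal gigabytes (1 GB = 10**6 KB)."""
    return documents * average_size_kb / 10**6


for documents in (500_000, 2_000_000, 10_000_000):
    size_gb = content_store_size_gb(documents)
    print(f"{documents:>10,} documents ~ {size_gb:,.0f} GB")
```

Under these assumptions, the largest repository (10 million documents) corresponds to roughly 2.5 TB of raw content.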

Alfresco cluster nodes

The third dimension chosen for the scalability analysis was the number of Alfresco repository cluster nodes deployed in the architecture. Traditionally, Alfresco has supported horizontal scalability. This means that adding Alfresco repository nodes is a good way to scale out the platform and, therefore, support a larger number of concurrent users or more content in the repository.

One fundamental consideration must be covered on this third dimension. As introduced already, Alfresco 4 balances the content management and indexing loads between the repository and the Solr tiers. As a consequence, scaling out the repository tier to handle additional load on the platform typically also requires scaling out the Solr tier, if the scenario is search intensive enough.

While scaling out Solr was required during the benchmarks (and we will indicate the data points for which it was required), Solr’s comparative scalability was not the main focus of this benchmark. In other words, we will provide information on how many Solr nodes were needed in each data point, but no comparison is available between the performance on the same scalability point against different Solr configurations.

In this sense, while we will provide detailed setup information for platform scalability in a very search-intensive scenario, repository tier scalability (not index tier scalability) is the third scalability dimension used in the benchmarks.

36 http://code.google.com/p/alfresco-bulk-filesystem-import/
37 http://docs.alfresco.com/4.0/topic/com.alfresco.enterprise.doc/concepts/Bulk-Import-Tool.html?resultof=%22%62%75%6c%6b%22%20%22%6c%6f%61%64%22%20

Metrics

The metrics definition and collection process is also common between the benchmarks presented in this document. In this first version of the benchmarks, we have collected the following metrics for every test run:

Availability Metrics
o Error rate per user operation/transaction (HTML reports)
o Aggregate error rate (HTML reports)

Performance Metrics
o Average response times per user operation/transaction (HTML reports)
o Aggregate average response times (HTML reports)
o Responses over time [38] (and other JMeter plugins [39] graphs); an example is shown in Figure 14 Responses over time JMeter plugins report

Resources Load Metrics
o For every machine:
   sar [40] utility report throughout the test run
   JMX (JConsole) overview of CPU, memory, heap, loaded classes, and threads, as shown in Figure 15 JConsole load graphs
o DB and file system usage statistics

Figure 14 Responses over time JMeter plugins report

Figure 15 JConsole load graphs

38 http://code.google.com/p/jmeter-plugins/wiki/ResponseTimesOverTime
39 http://code.google.com/p/jmeter-plugins/
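To make the availability and performance metrics concrete, the following sketch computes the average response time and error rate per transaction, plus an aggregate, from JMeter-style CSV results. It is an illustration rather than the reporting used in the benchmark: the column names (label, elapsed, success) are assumed to follow JMeter's CSV result format, the sample rows are invented for the example, and the summarize function and the ALL key are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize(results_csv):
    """Return {label: (average elapsed ms, error rate)} plus an 'ALL' aggregate."""
    totals = defaultdict(lambda: [0, 0, 0])  # label -> [samples, elapsed sum, errors]
    for row in csv.DictReader(results_csv):
        for key in (row["label"], "ALL"):
            entry = totals[key]
            entry[0] += 1
            entry[1] += int(row["elapsed"])
            entry[2] += row["success"].lower() != "true"
    return {label: (elapsed / count, errors / count)
            for label, (count, elapsed, errors) in totals.items()}


# Invented sample data, for illustration only.
sample = io.StringIO(
    "timeStamp,elapsed,label,success\n"
    "1,200,Search,true\n"
    "2,400,Search,false\n"
    "3,100,Site dashboard,true\n"
)
report = summarize(sample)
for label, (avg_ms, error_rate) in report.items():
    print(f"{label}: avg {avg_ms:.0f} ms, errors {error_rate:.0%}")
```

The per-label entries correspond to the per-transaction metrics, and the ALL entry corresponds to the aggregate figures.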

In addition to these metrics, we also collected the Bulk File System Import tool reports during the bulk-loading phase; we will briefly present them in the section of this document that covers bulk loading. In the following sections, we will present meaningful excerpts of these results to study scalability. For additional information on the results, please contact Alfresco Support.

Benchmark lab infrastructure

Benchmark lab hardware

Up to 15 machines were used during the benchmarks:

1 machine for the Balancing Tier

1 machine for the Web Tier

Up to 4 machines for the Content Tier

Up to 4 machines for the Index Tier

1 machine for Data Tier

1 machine dedicated to host the Shared Content Tier for the cluster nodes

40 http://linux.die.net/man/1/sar


Up to 3 Client machines in the Client Tier

Table 4 Benchmark Lab hardware details provides machine specifications.

TIER | NODES | CPU | RAM | HDD | NOTES
Balancing | lb1 | Quad Core Xeon L5520 (2.26GHz) | 6GB | 2*600GB RAID 1 | Software load balancing between Share and Alfresco, and between Alfresco and Solr
Web | st1 | Dual Hex Core X5650 2.66 GHz | 24GB | 4*128GB RAID 5 | Users main entry point UI
Content | at1 at2 at3 at4 | Dual Hex Core X5650 2.66 GHz | 24GB | 4*600GB RAID 5 | Alfresco repository cluster
Data | dt1 | Dual Hex Core X5670 2.93 GHz | 48GB | 4*600GB RAID 5 | For DBMS (PostgreSQL)
Index | sn1 sn2 sn3 sn4 | Dual Hex Core X5670 2.93 GHz | 48GB | 4*600GB RAID 5 | For load balanced Apache Solr
Shared Storage | ms1 | Quad Core Xeon L5520 2.26GHz | 6GB | 6 x 1TB 7k SATA 3.5 | Exposing an NFS Share mounted on Alfresco nodes
Client | cld1 cld2 cld3 | Dual Quad Core Xeon L5520 2.26GHz | 12GB | 146 GB 15K SAS 3.5 x 2 RAID 1 | Reproducing virtual usage running JMeter on Windows 7

Table 4 Benchmark Lab hardware details

All machines used were physical [41], and the machines were connected over Gigabit Ethernet.

Benchmark lab software

Table 5 Benchmark Lab Software details provides information on the main software components.

TIER | MACHINES OS | RELEVANT SOFTWARE | DETAILS
Balancing Tier | RHEL5 | Apache Httpd 2.3 | mod_proxy and mod_proxy_ajp [42] to use Httpd to balance requests
Web Tier | RHEL5 | Alfresco Share 4.0.0 (792) |
Alfresco Tier | RHEL5 | Alfresco 4.0.0 (792), Apache Tomcat 6.0.29 | Using Shared Storage for shared content store via NFS
DBMS Tier | RHEL5 | PostgreSQL 9.0.4 | Single node deployment
Index Tier | RHEL5 | Alfresco Solr 4.0.0 (792) | Indexes on local RAID 5 disk
Shared Storage | RHEL5 | NFSd [43] | Exposing an NFS Share mounted on Alfresco cluster nodes
Load Test Client Drivers | Windows 7 | Jakarta JMeter 2.5.1 | Running JMeter on Windows 7

Table 5 Benchmark Lab Software details

41 Dell R710: http://www.dell.com/us/enterprise/p/poweredge-r710/pd
42 http://httpd.apache.org/docs/2.2/en/mod/mod_proxy_ajp.html

The Alfresco 4 Scalability benchmark

The Alfresco 4 Scalability benchmark was the largest effort of the benchmarking project and is the main subject of this document. This section provides details of the tests run, the architecture, the configuration, and the results, while we leave the scalability analysis, detailed considerations, and recommendations to the next chapter.

Scope

The benchmark should be as realistic as possible and representative of a large-scale Enterprise-wide deployment of Alfresco Share as a collaboration platform.

The objective of the Alfresco 4 Scalability benchmark is to study Alfresco performance and resource load variations while running the Enterprise collaboration scenario using different scalability points: #content, #users, and #clusterNodes.

A stretch goal of this benchmark was to prove horizontal scalability of the Alfresco repository tier as a means of coping with more content and users.

Out of scope

The following was not in scope:

Study optimized configuration/tuning for each scalability point: Configuration was defined a priori and retained between test runs to analyze scalability on the selected dimensions.

Index tier horizontal scalability: While Solr had to be scaled up to 4 nodes in certain scalability points, the Index Tier should be considered a black box for the scope of this test.

Scalability points (data points)

With respect to the space identified by the scalability dimensions, each scalability point can be identified by a triplet: (#users, #content, #clusterNodes)

As shown in Table 6 Alfresco 4 Scalability benchmarks scalability points, scalability points are identified by selecting discrete values in each dimension. The test scenario is then run iteratively against combinations of the values selected in each scalability dimension in order to produce comparable results. If additional content is required in the repository, a phase of bulk loading (and Solr index tracking) is performed before the next test run.

43 http://linux.die.net/man/7/nfsd


DIMENSION | SELECTED POINTS | NOTES
Concurrent Users (x) | 180, 360, 720, 1080 | 0 is the baseline. Scale up to 1080 users required the 3 client load drivers.
Content in repository (y) | 0, 500K, 2M, 10M | 0 is the baseline. Content loaded using BFSIT.
Alfresco Cluster Nodes (z) | 1, 2, 4 | To study effects of repository horizontal scalability. Solr was scaled as required up to 4 load balanced nodes.

Table 6 Alfresco 4 Scalability benchmarks scalability points

The scalability test was not run against every potential scalability point defined by the dimension values above (x*y*z = 4*4*3 = 48 total test runs). Instead, a meaningful subset of those points was selected, based on the following main rationales:

Controlling the benchmark execution complexity

Avoiding redundant data collection

Focusing the effort on realistic deployment scenarios
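The size of the full grid can be checked with a short sketch. The Python below enumerates every (#users, #content, #clusterNodes) triplet from the dimension values in Table 6. The variable names are illustrative assumptions and are not part of the benchmark tooling.

```python
from itertools import product

# Dimension values taken from Table 6 (variable names are illustrative).
CONCURRENT_USERS = [180, 360, 720, 1080]               # x
CONTENT_IN_REPO = [0, 500_000, 2_000_000, 10_000_000]  # y
CLUSTER_NODES = [1, 2, 4]                              # z

# Every potential scalability point is one (users, content, nodes) triplet.
all_points = list(product(CONCURRENT_USERS, CONTENT_IN_REPO, CLUSTER_NODES))

print(len(all_points))  # 4 * 4 * 3 = 48
```

Only a subset of these 48 triplets was actually exercised by the benchmark.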

Table 7 provides an overview of the selected scalability points. This resulted in running the test scenario against 33 different triplets:

CONCURRENT USERS /CONTENT IN REPO

180 360 720 1080

0 docs (baseline) 1 node 2 nodes 4 nodes N/A


500k docs 1 node 1 node 1 node 1 node

2 nodes 2 nodes 2 nodes 2 nodes

4 nodes 4 nodes 4 nodes 4 nodes

2M docs 1 node 1 node 1 node 1 node

2 nodes 2 nodes 2 nodes

4 nodes 4 nodes

10M docs 1 node 1 node 1 node 1 node

2 nodes 2 nodes 2 nodes

4 nodes 4 nodes

Table 7 Benchmark Tests overview matrix
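As a cross-check of the 33 triplets quoted above, the following sketch tallies the entries of Table 7 row by row. It records only how many user levels were tested for each content level and Alfresco node count, and it assumes the "N/A" cell in the baseline row means no test run; the dictionary layout is an illustrative assumption.

```python
# Number of user levels tested per (content level, Alfresco node count),
# read row by row from Table 7. The "N/A" baseline cell is assumed to be no run.
RUNS_PER_ROW = {
    "0 docs (baseline)": {1: 1, 2: 1, 4: 1},
    "500k docs": {1: 4, 2: 4, 4: 4},
    "2M docs": {1: 4, 2: 3, 4: 2},
    "10M docs": {1: 4, 2: 3, 4: 2},
}

# Sum the run counts over every content level and node count.
total_runs = sum(sum(per_nodes.values()) for per_nodes in RUNS_PER_ROW.values())

print(total_runs)  # 3 + 12 + 9 + 9 = 33
```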

A note on Solr usage

Although this topic is not the main objective of the benchmark, it is important to understand which index tier configuration and architecture were used throughout the benchmark. It is also helpful to know how an Enterprise infrastructure can cope with large-scale, highly search-intensive scenarios like the one implemented in the benchmark.

Based on the flexible load-balanced architecture described in the next sections, Solr nodes were added as required by the test. Mirroring Table 7, Table 8 indicates how many Solr nodes were used in each test run:


CONCURRENT USERS / CONTENT IN REPO

180 360 720 1080

0 docs (baseline) 1 node 2 nodes 4 nodes

500k docs 1 node 1 node 1 node 1 node

2 nodes 2 nodes 2 nodes 2 nodes

4 nodes 4 nodes 4 nodes 4 nodes

2M docs 1 node 1 node 1 node 1 node

2 nodes 2 nodes 2 nodes

4 nodes 4 nodes

10M docs 1 node 1 node 1 node 1 node

2 nodes 2 nodes 2 nodes

4 nodes 4 nodes

Legend:

1 Solr node

2 Solr nodes

4 Solr nodes

Table 8 Solr nodes usage throughout test executions

A few considerations can be drawn from Table 8:

1. The vast majority of scalability points were covered with a single Solr node.

2. In both cases where the number of Solr nodes had to be increased, the need arose from a step up in the Alfresco repository cluster nodes dimension, within the same (#users, #content) scalability area. This suggests that Alfresco acts purely as a search client for Solr: when the Alfresco repository is saturated, scaling out Alfresco nodes often requires scaling out Solr proportionally.

3. The proportionality factor between Alfresco and Solr scalability is tightly related to the read/search/write split of the scenario. This is discussed in more detail elsewhere in this document.

Architecture

Using the Benchmark lab infrastructure, the architecture was built on the following assumptions:

The architecture should provide high availability and be based on the Alfresco Simple Repository Clustering model [44]

The deployment should be optimized for performance and load balancing, and should scale to the most challenging scalability points without changing the deployment scheme

The architecture should be easy to reconfigure, so that the Alfresco cluster and underlying components can be changed in order to investigate the #clusterNodes dimension

Under these assumptions, the logical architecture was designed with the following layers, each on separate lab machines:

Client Tier: includes the Load Test Client driver machines running the tests

Web Tier: includes the single Alfresco Share instance used

Load Balancing Tier: uses Apache Httpd as a balancer from Share to Alfresco and from Alfresco to Solr

Content Tier: includes the 1 to 4 instances of the Alfresco repository

Index Tier: includes the 1 to 4 instances of Solr servers

Storage Tier: includes a dedicated machine for the DB and one for NFS shared storage

A high-level logical architecture diagram is provided in Figure 16 Scalability benchmark logical architecture.

44 http://wiki.alfresco.com/wiki/Cluster_Configuration_V2.1.3_and_Later#Simple_repository_clustering


Figure 16 Scalability benchmark logical architecture
