Architecting for the Enterprise
Who am I
Who is CONTENS
4 founders; one of them is here with me at our booth.
No other vendor in Europe has more installations.
Built predominantly with CFML.
Social Software: Relate
30 highly educated employees, more than 15 certified CF developers (most of them with 5-10 years of experience).
Home of Oktoberfest, Lederhosen, BMW, and the best beer in the world.
CONTENS the beginning
Here is a small throwback to the ghost of CONTENS past.
In this talk I will concentrate on the architectural design considerations behind CONTENS version 3. As can be seen, the jump in LOC from 2.5 to 3 is several orders of magnitude. To be fair, version 3 is a complete re-architecture of 2.5, and a lot of thought went into it.
What is Enterprise Software
What is Enterprise Software?
Proprietary
Expensive
Complex
etc
There is enterprise software and then there is enterprise software. Not everyone's definition of enterprise software is the same. It is a term that is often overused, abused, and sometimes even used as an insult for "overly complex software" or software that is overkill for smaller organizations.
Enterprise Software
The Three Pillars
Taking the previous quote we can extract "the three pillars" of enterprise software, those being:
Performance
Scalability, and
Robustness
...Usability...
Those are certainly not the only criteria for enterprise software, but without those three you are lost. I would add usability as another point: you can have a superbly performing, scaling, and robust piece of software, but if it has crappy usability you can forget about user acceptance.
Transition: how can we achieve the three pillars of enterprise software? We need a framework for thinking about software development. When designing software, or any concept, we have a set of principles that guide us: we first set some goals, then evaluate and validate what's available to use, tune performance end to end, and finally measure and evaluate continually.
Performance
Performance
Can we design with performance in mind?
You betcha!
Performance Design
Design coarse-grained services. A rule of thumb is to build a series of lower-level, fine-grained objects and components and then build more coarse-grained services or facades out of them. This approach gives you looser coupling and increases your ability to encapsulate change.
Validate user input early. If possible, validate all user input on the client side. This doesn't mean you shouldn't validate the input on the server side, but client-side validation is going to catch most of the common user input errors and reduce the number of round trips to the server for validation.
Minimize round trips: database, client<>server, network, etc. This comes back to the idea of coarse-grained services: batch calls together so that you perform a single logical operation in a single round trip.
If possible, write to the file system asynchronously.
Use caching. If it costs a lot to retrieve it, cache it!
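To make the round-trip point concrete, here is a minimal sketch in Python. The CountingStore class and both loader functions are hypothetical stand-ins invented for illustration; they are not CONTENS code. The store simply counts how often it is called, so the fine-grained and the coarse-grained variant can be compared.

```python
class CountingStore:
    """In-memory stand-in for a database that counts round trips (hypothetical)."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.round_trips = 0

    def fetch_one(self, key):
        # One call, one round trip, one row.
        self.round_trips += 1
        return self._rows[key]

    def fetch_many(self, keys):
        # One call, one round trip, many rows.
        self.round_trips += 1
        return {key: self._rows[key] for key in keys}


def load_fine_grained(store, keys):
    """Asks the store once per key."""
    return {key: store.fetch_one(key) for key in keys}


def load_coarse_grained(store, keys):
    """Batches the whole logical operation into a single call."""
    return store.fetch_many(keys)


rows = {1: "Teaser A", 2: "Teaser B", 3: "Teaser C"}
fine_store = CountingStore(rows)
coarse_store = CountingStore(rows)

fine_result = load_fine_grained(fine_store, [1, 2, 3])
coarse_result = load_coarse_grained(coarse_store, [1, 2, 3])

print("fine-grained round trips:", fine_store.round_trips)
print("coarse-grained round trips:", coarse_store.round_trips)
```

Both variants return the same data; only the number of calls that cross the boundary differs.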
Performance Design II
1) Reduce contention for critical resources (DB, file system). When acquiring a resource, hold it for the least possible amount of time.
2) Process independent tasks concurrently. When you have multiple independent tasks that don't rely on information from one another or block each other, run them asynchronously or use a separate thread to fire and forget. Remember that you will benefit most when the tasks are IO bound; tasks that are CPU bound suffer from diminishing returns when trying to scale up (Amdahl's law).
The classic example: if 70% of a program can be parallelized and we run it on 4 CPUs instead of 1, Amdahl's equation gives a speedup of about 2.105. If we now double the available CPUs to 8, the speedup only rises to about 2.58, roughly a fifth more, where we might expect the performance to double.
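The numbers in this example can be checked with a few lines of Python. This is just Amdahl's formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the number of CPUs; the function name is my own.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Theoretical speedup when `parallel_fraction` of the work runs on `processors` CPUs."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


four_cpus = amdahl_speedup(0.7, 4)
eight_cpus = amdahl_speedup(0.7, 8)

print(f"4 CPUs: {four_cpus:.3f}x")
print(f"8 CPUs: {eight_cpus:.3f}x")
print(f"gain from doubling the CPUs: {eight_cpus / four_cpus - 1:.1%}")
```

Note that with 30% of the program still serial, the speedup can never exceed 1 / 0.3, about 3.33, no matter how many CPUs are added.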
Performance Caching
Performance through Caching
We all know about caching and the important role it plays in boosting performance and scalability. The fact that there are two dedicated talks on caching at this year's CFUNITED is a testament to its importance.
Performance Caching II
Caching
1) Why does caching play such an important role? Caching can help us with two of our "three pillars" of enterprise software (performance and scalability). How does it do that?
We boost performance by storing data that is expensive to fetch in a place where it is less expensive to fetch (RAM).
We boost scalability (scale out) by reducing load and contention on critical resources such as the DB, the network, the application server (instantiation of components), and the web server.
2) We can cache at the database level (queries), the application level (objects, components, page fragments, entire pages), and the web server/client level (CSS, JS, images).
When should we cache? When the data is global, frequently used, requires a lot of processor power to produce, or is relatively static.
3) As is typical for CF, we have a lot of options when it comes to where we can cache.
You can use the native caching methods available in ColdFusion, but they are not very flexible and we don't have much control over them.
You can build your own caching mechanism, use a framework that comes with one (e.g. ColdBox), or use SoftCache.
Or you can use a third-party distributed caching system that can be integrated into your application, such as memcached, Ehcache, JBoss Cache, JCS, etc.
Performance Caching III
Caching
Architecture: in-process / out-of-process.
In-process caching is faster than out-of-process caching, but you are restricted in size on 32-bit systems (not much of an issue on 64-bit). Scale up only. It's also what comes bundled with ACF9 and Railo.
Out-of-process caching is slower than in-process caching, has no size limitation on 32-bit systems, requires objects to be serialized, and scales out to infinity.
Algorithms: cost-based (FIFO, LRU, LFU, MRU), time-based, and a whole slew of other types. [Railo EhCacheLite comes with LRU, LFU and FIFO algorithms.]
Strategies: deterministic / non-deterministic. Most caching strategies use a non-deterministic approach (check the cache before going to the DB).
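As an illustration of one of these algorithms combined with the non-deterministic strategy, here is a small in-process LRU cache in Python. It is a sketch only: the class, its get_or_load method, and the slow_lookup loader are names made up for this example and do not describe how Ehcache, Railo, or CONTENS implement eviction.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny in-process cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        # Non-deterministic strategy: check the cache first, go to the source on a miss.
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = loader(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
        return value

    def keys(self):
        """Cached keys, least recently used first."""
        return list(self._items)


def slow_lookup(key):
    """Stand-in for an expensive database query (hypothetical)."""
    return key.upper()


cache = LRUCache(capacity=2)
cache.get_or_load("a", slow_lookup)  # miss
cache.get_or_load("b", slow_lookup)  # miss
cache.get_or_load("a", slow_lookup)  # hit, "a" becomes most recently used
cache.get_or_load("c", slow_lookup)  # miss, evicts "b"

print(cache.keys(), cache.hits, cache.misses)
```

FIFO would differ only in skipping the move_to_end call on a hit; LFU would need a usage counter per entry.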
Performance Caching in CONTENS
Caching: CONTENS Decision
After evaluating the available caching options and the different algorithms and strategies, CONTENS decided to:
Build a caching façade that lets us support multiple caching methods and, without much effort, keep expanding as new and better caches come on the market.
The first and simplest cache available is the default cache, which stores cache items in the application scope in a two-level structure with timestamps to allow for a time-based eviction policy.
SoftCache, which uses Java soft references, was the next to be implemented in order to save memory.
Some of our customers run on the open-source Railo CFML engine, so we built a module to take advantage of Railo's cluster cache.
We also have a memcached module that uses the Java client, generally used in a clustered environment.
And lastly, and most recently, Ehcache for both CF9 and Railo customers (in-process caching).
Show the caching facade in code?
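The actual CONTENS facade is CFML and is not reproduced in these notes. As a rough idea of the shape, here is a sketch in Python under my own assumptions: CacheBackend, DefaultCache, CacheFacade, and FakeClock are hypothetical names, and DefaultCache only imitates the two-level structure with timestamps described above.

```python
import time
from abc import ABC, abstractmethod


class CacheBackend(ABC):
    """What every cache module has to provide (hypothetical interface)."""

    @abstractmethod
    def get(self, region, key):
        """Return the cached value, or None if absent or expired."""

    @abstractmethod
    def put(self, region, key, value):
        """Store a value under region/key."""


class DefaultCache(CacheBackend):
    """Two-level dict (region -> key -> (timestamp, value)) with time-based eviction."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._regions = {}

    def get(self, region, key):
        entry = self._regions.get(region, {}).get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._regions[region][key]  # expired: evict on read
            return None
        return value

    def put(self, region, key, value):
        self._regions.setdefault(region, {})[key] = (self._clock(), value)


class CacheFacade:
    """Callers talk to the facade only; the backend can be swapped without touching them."""

    def __init__(self, backend):
        self._backend = backend

    def get_or_load(self, region, key, loader):
        value = self._backend.get(region, key)
        if value is None:  # simplification: a None result is never cached
            value = loader()
            self._backend.put(region, key, value)
        return value


class FakeClock:
    """Manually advanced clock so the demo does not have to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
cache = CacheFacade(DefaultCache(ttl_seconds=60, clock=clock))
load_calls = []


def load_page():
    load_calls.append("rendered")
    return "<html>home</html>"


cache.get_or_load("pages", "home", load_page)  # miss: renders the page
cache.get_or_load("pages", "home", load_page)  # hit: served from the cache
clock.now = 61                                 # the entry is now older than the TTL
cache.get_or_load("pages", "home", load_page)  # expired: renders again
print(len(load_calls))
```

A memcached or Ehcache module would be another CacheBackend subclass; the calling code keeps using CacheFacade.get_or_load unchanged.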
Performance Code Generation in CONTENS
Something else that CONTENS does, and not only for performance reasons, is generate code.
Code is generated using a combination of templates and snippets.
Tablebot is CONTENS' own ORM. It works bottom up, that is, from the database table to CFCs, and generates DAO-style CFCs. The gains from Tablebot are more in the area of development performance, since we are freed from writing any SQL apart from custom queries that may be needed. It also greatly reduces errors and gives us a central location for tuning: we tune the base code and regenerate the CFCs.
Formbot, as the name implies, generates GUI forms for CRUD methods on tables, as well as for extending CONTENS through applications or modules. Some CMSs either rely on some kind of meta-model storage of the forms in XML or have developed their own special markup language to build the forms. CONTENS provides Formbot to basically click together a form using the GUI. This gives us almost complete control of the codebase and provides consistent functionality.
ObjectClass Generator. Object classes in CONTENS are internal content types, e.g. press release, article, product, teaser, etc. These object classes can be "clicked" together, generated, and deployed. The generation also uses Formbot to generate a form to edit/update the contents of the newly created object class. Here again the class is generated into a CFC file that handles all the CRUD work (without SQL), along with the language .properties files and the view files, and lastly a very basic template that allows you to add the new content object to a page. [demo]
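To give a feel for bottom-up, template-based generation, here is a toy sketch in Python. It is not Tablebot: the template, the generate_dao function, and the press_release table are invented for illustration, and it emits a Python class where Tablebot emits CFCs.

```python
from string import Template

# Template for the generated class; tuning this template and regenerating
# updates every DAO at once.
DAO_TEMPLATE = Template('''\
class ${class_name}DAO:
    """Generated DAO for table ${table}. Do not edit; regenerate instead."""

    TABLE = "${table}"
    COLUMNS = (${column_tuple})

    def insert_sql(self):
        return "INSERT INTO ${table} (${column_list}) VALUES (${placeholders})"

    def select_sql(self):
        return "SELECT ${column_list} FROM ${table} WHERE ${pk} = ?"
''')


def generate_dao(table, columns, pk):
    """Builds the source code of a DAO class from table metadata."""
    class_name = "".join(part.capitalize() for part in table.split("_"))
    return DAO_TEMPLATE.substitute(
        class_name=class_name,
        table=table,
        column_tuple=", ".join(repr(column) for column in columns) + ",",
        column_list=", ".join(columns),
        placeholders=", ".join("?" for _ in columns),
        pk=pk,
    )


source = generate_dao("press_release", ["id", "title", "body"], pk="id")

# Load the generated source to show that it is valid, usable code.
namespace = {}
exec(source, namespace)
dao = namespace["PressReleaseDAO"]()
print(dao.insert_sql())
print(dao.select_sql())
```

In a real generator the column list would come from the database's own metadata rather than from a hand-written list.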
Performance Code Generation Demo
I'd like to give you a very simple demonstration of object class generation.
Performance Monitoring & Testing
Performance - CONTENS
Remember that optimizing a single feature could take away resources from another feature and hinder overall performance.
You obtain the most benefit from performance testing when you tune end-to-end, rather than spending considerable time and money on tuning one particular subsystem. Identify bottlenecks, and then tune specific parts of your application.
In order to perform end-to-end performance testing you need some tools, e.g.:
FusionReactor, log file analysis: FusionReactor plays a major role in our ongoing performance monitoring, along with error log analysis, to proactively find any trouble spots that could start affecting application performance. That includes errors that are not part of the application itself, like network or IO issues, or issues affecting the application through third-party modules or even external databases.
Maatkit, MSSQL Query Analyzer: nobody is perfect, and when it comes to writing queries for 5 different databases you really have to be careful. What runs super fast in MSSQL could be dead slow in MySQL or PostgreSQL, and vice versa. Having huge test databases to test your queries against is a big help in catching problems before they become problems.
JMeter, Web Capacity Analysis: we also use JMeter and the MS web capacity analysis tools to attempt to simulate heavy loads on the system. Nothing really compares to the type of load that happens in the wild. We are thankful that we have a customer that generates the best metrics, better than any test plan we could have written.
Performance Monitoring & Testing II
[Need to change this to the full week]
Here is a good example of the amount of traffic at one of our customer sites during a week in April. ~50k page impressions per day is not a big deal, right? Well, in this case the traffic is on the editor server. Only editors work on this server; there is no internet traffic at all.
The visitors section is a little misleading: they have about 200 or so active editors, but some are coming in over VPN, and none of them have fixed IPs.
We have almost reached the capacity of this server and are currently planning to scale out the editor system by setting up a cluster. That brings us to the next pillar...
Scalability
What is a Scalable System?
A scalable system is a system whose performance improves after adding hardware, proportionally to the capacity added.
In other words
If I can add more hardware and the performance of my application improves then I have a system that can scale. The level of scalability, however, depends on how the application was architected.
Scalability Pyramid
The Scalability pyramid.
Hardware, software, and tuning play only a small part in the ability of software to scale well. The base of the pyramid is design, and it has the greatest influence on scalability.
Scalability must be part of the design process because it's not a piece you can add later. Scalability is a balanced partnership between software and hardware. The design choices made will either contribute to or inhibit the ability of your application to scale.
Designing for scalability is similar to designing for performance. There is one caveat, though: performance and scalability sometimes come into conflict with each other. More on that later.
Scalability Design
As we saw in the previous slide, scalability is a design decision. If you don't design for scalability, you can't use hardware to scale.
As with performance, we need to minimize resource contention; it's the root of all evil when it comes to scalability. By resources in this case I mean memory, CPU cycles, bandwidth, and DB connections.
Use threading. As I mentioned previously, you gain the most benefit from threading with IO-bound tasks, because CPU-bound tasks run into diminishing returns.
A process that waits is a process that isn't doing anything constructive and is just holding on to resources that other processes may need.
Naturally, not all processes can be performed asynchronously. For long-running processes you can use a queue (ActiveMQ, RabbitMQ) to process a task. In CONTENS, publishing tasks can be added asynchronously to a queue; the queue is then worked on by one or more workers.
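The queue-and-workers pattern can be sketched with Python's standard library. This is an in-process stand-in, not ActiveMQ or RabbitMQ, and run_publish_queue and the publish callback are hypothetical names; a real broker would add persistence and let the workers run on other machines.

```python
import queue
import threading


def run_publish_queue(page_ids, worker_count, publish):
    """Puts publishing tasks on a queue and lets `worker_count` workers drain it."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            page_id = tasks.get()
            try:
                if page_id is None:  # sentinel: no more work for this worker
                    return
                try:
                    outcome = publish(page_id)
                except Exception as exc:  # one failed task must not kill the worker
                    outcome = f"failed: {page_id}: {exc}"
                with results_lock:
                    results.append(outcome)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for page_id in page_ids:
        tasks.put(page_id)  # the caller returns immediately after enqueuing
    for _ in threads:
        tasks.put(None)     # one sentinel per worker
    tasks.join()
    for thread in threads:
        thread.join()
    return results


published = run_publish_queue(
    [1, 2, 3, 4, 5],
    worker_count=2,
    publish=lambda page_id: f"page-{page_id}.html",
)
print(sorted(published))
```

With worker_count=1 the tasks are processed strictly one after another, which is the serialized variant; more workers trade that guarantee for throughput.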
You can gain a lot of scalability if parts of your system can be distributed. When the capacity of a single machine no longer meets the needs of the users, a processor-intensive aspect of the application can be moved onto another machine, thereby relieving a portion of the load.
Scalability Scaling Methods
Hardware scalability.
Distribution lets you farm out tasks that require heavy CPU or file system access, freeing up resources to provide a smooth, responsive user interface. For instance: a server for the user interface and its tasks, a publishing & distribution server, and a media server.
[The image servlet is probably something unique to CONTENS. It came about with the need to scale images to user-defined sizes, so that you can show a thumbnail of an image in a teaser and a larger image in the full page, or perhaps several sizes for several different locations. Since CONTENS primarily publishes static HTML pages, the images need to be resized and distributed to the live server. The more custom sizes a customer needs, the more images get produced, and the more CPU and disk space are required to produce them. The problem is that not all sizes are always needed, but all were always produced and distributed, reducing performance and increasing publishing times. The image servlet creates the image only when it is actually used and published. This reduced unnecessary image manipulation and distribution.]
Slide 23
Scaling up:
Scaling up is simply moving from a smaller server to a larger one. Larger meaning more CPUs, more memory, faster hard drives.
Plus side: no change in administration, no code changes required.
Down side: as I mentioned earlier, adding CPUs doesn't add performance linearly; you have to deal with diminishing returns. In this configuration you also have a single point of failure.
Scaling out:
In contrast to scaling up, scaling out distributes the load across multiple smaller machines.
Plus side: you can gain fault tolerance or failover protection, use cheaper machines, and get a much more linear performance increase.
Down side: more servers to manage.
Scalability Clustering
In this first scenario we are simply clustering the main application layer to distribute load, using either software or hardware load balancing and sticky sessions
Scalability Clustering II
Here is a similar scenario but with the database clustered as well as the editor systems
Scalability Distribution
In a distributed configuration, specific systems are hosted on separate servers (these could also be instances or VMs). Here we have a single editor instance with a publishing server responsible for generating and distributing the assets/files and a media server responsible for hosting, well, the media.
We could also combine this topology with the previous two slides, so that we end up with the following.
Scalability Distribution II
Scalability vs Performance
Perhaps you are asking yourself how there can be a competition between scalability and performance?
Parallelization: when I talk about parallelization, I'm talking more about multiple simultaneous or parallel requests, which are becoming a lot more typical with the Ajax model.
Scenario: a typical scenario for us is the publication of content. What happens if several people push the publish button at once? The page they are publishing can have a cascading effect (meaning *many* other pages can be published as well). With all the activity going on in the database, the potential for a deadlock increases with the number of simultaneous requests that affect the same tables. A critical mass will eventually be reached where parallel requests cause problems. At some point a decision has to be made: enter request serialization or message queues.
Serialization or messaging (enterprise service bus): by using a messaging system like ActiveMQ, the editors fire off publishing requests that land in a queue. The queued messages are then processed sequentially, virtually eliminating any possibility of contention for database resources.
Trade-offs:
Parallelization = Amdahl's law
Serialization = by using a message queue you introduce another layer into the system architecture and potentially additional overhead and/or delays, but what you lose in upfront performance you gain in scalability. There is also the possibility of providing more than one message consumer to work on a particular queue.
Robustness
What is Robustness in relation to software development?
"Robustness is the quality of being able to withstand stresses, pressures, or changes in procedure or circumstance. A system is said to be 'robust' if it is capable of coping well with variations (sometimes unpredictable) in its operating environment with minimal damage, alteration or loss of functionality."
It is the ability of software to deal with every conceivable error, no matter how unlikely.
Software tends to be built with the expectation that the user will follow what developers consider to be the sensible course of action, and as we all know users are NOT sensible.
We can, however, design and build our software in such a way as to handle "most" of the input variations.
Once we think we have done that, we need tests to verify that we have accomplished it. It's important that we don't build software simply to pass the tests; then all we have is software that meets the quality of the tests. If the quality of the tests is poor, well...
Robustness, in my opinion, is a result not only of good architectural decisions but also having the appropriate testing methodologies in place.
Robustness Design
Validate the inputs. In my world there is no such thing as "garbage in -> garbage out". Don't let the garbage in the door in the first place. I mentioned previously that we should validate on the client side to reduce the number of round trips to the server; this doesn't mean that the input isn't validated on the server as well. Input needs to be validated at ALL levels of the system. If you expect something to be there and don't first check that it's really there, then you have a problem when it's not there.
"Shit happens, life goes on." If the DB server goes down you are technically dead in the water: nothing can be saved or read. But the user doesn't need to see an ugly stack trace; the user interface should still function to the extent that the user is informed of the problem. The same goes for a file share, FTP, or SFTP host that's no longer reachable. Life goes on; deal with the problem gracefully.
Make sure all problems are logged and that the logs are actually checked periodically. The logs should also contain enough information for you to actually make use of them (stack trace, method name, etc.).
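The three points above fit into one small sketch. Everything here is hypothetical (validate_title, save_article, the 200-character limit, the logger name); it only shows the pattern: validate on the server, fail gracefully, and log enough to diagnose the problem later.

```python
import logging

logger = logging.getLogger("cms.sketch")


class ValidationError(ValueError):
    """Raised when user input is rejected."""


def validate_title(raw):
    """Server-side validation; runs even if the client already validated."""
    if not isinstance(raw, str):
        raise ValidationError("Title must be text.")
    title = raw.strip()
    if not title:
        raise ValidationError("Title must not be empty.")
    if len(title) > 200:
        raise ValidationError("Title must be at most 200 characters.")
    return title


def save_article(raw_title, store):
    """Returns a result the UI can show; never lets a stack trace reach the user."""
    try:
        title = validate_title(raw_title)
    except ValidationError as exc:
        return {"ok": False, "message": str(exc)}
    try:
        store(title)
    except OSError:
        # logger.exception records the message together with the full stack trace.
        logger.exception("Saving article %r failed", title)
        return {
            "ok": False,
            "message": "The article could not be saved right now. Please try again later.",
        }
    return {"ok": True, "message": "Saved."}


def broken_store(title):
    """Simulates a database that is down (ConnectionError is a subclass of OSError)."""
    raise ConnectionError("database unreachable")


saved = []
print(save_article("   ", saved.append))
print(save_article("Hello world", broken_store))
print(save_article("Hello world", saved.append))
```

The user sees a readable message in all three cases, while the stack trace of the failed save goes to the log.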
Robustness Testing
Both functional and non-functional testing need to be performed to establish how robust the application is.
Functional: we first need to test our software to make sure that it functions as expected, "that the features actually work as expected".
Non-functional: does not relate to a specific feature or action but more to the scalability, performance, and security of the software (number of simultaneous users? can it be hacked? what is the threshold?).
Robustness Testing II
No single testing method is by itself good enough to catch every error in a program, as it's pretty much "impossible" to evaluate every single execution path.
Unit testing tests a specific unit of code at the functional level. For unit testing we use MXUnit and CFUnit (although it appears that development of CFUnit has stalled). White box method.
Fuzz testing is testing with random, invalid, or unexpected data and is typically a black box testing method.
Integration testing helps to find problems or defects in the interaction of different parts or modules of the software.
Answers the question "does it fulfill the requirements?"
Regression testing helps find problems introduced after code changes. Here we use Ant builds and a number of unit tests/assertions.
At the functional level we also do code walkthroughs or peer reviews. We religiously run scope-checker and queryparam scanner.
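Of the methods listed, fuzz testing is the easiest to show in a few lines. The sketch below is Python rather than CFML, and parse_page_id is a made-up function under test; the point is the loop that throws random strings at it and reports anything that breaks its contract (return a positive integer or raise ValueError).

```python
import random
import string


def parse_page_id(raw):
    """Hypothetical function under test: returns a positive int or raises ValueError."""
    if not isinstance(raw, str):
        raise ValueError("page id must be a string")
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError("page id must consist of digits")
    value = int(text)
    if value < 1:
        raise ValueError("page id must be positive")
    return value


def fuzz(target, runs, seed):
    """Feeds random strings to `target` and collects every contract violation."""
    rng = random.Random(seed)  # seeded, so a failure can be reproduced
    alphabet = string.printable + "\x00\u00e4\u4e2d"
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 20)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            result = target(candidate)
        except ValueError:
            continue  # a documented rejection is fine
        except Exception as exc:  # anything else is a robustness bug
            failures.append((candidate, repr(exc)))
            continue
        if not (isinstance(result, int) and result >= 1):
            failures.append((candidate, f"unexpected result {result!r}"))
    return failures


failures = fuzz(parse_page_id, runs=2000, seed=42)
print("contract violations:", len(failures))
```

Because the fuzzer knows nothing about the implementation, it is a black box method; it complements unit tests rather than replacing them.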
Robustness Testing III
As I mentioned before, non-functional testing doesn't test any specific feature or functionality of the software but tests it as a whole using various methods.
Load testing: now we are getting to what most of us understand as robust: how much traffic can this baby handle and still keep standing! But still standing doesn't mean the fight is won. If your software can handle hundreds or even thousands of simultaneous users but doesn't continue to function as expected (meaning incorrect or unexpected results), is the software still considered robust? Nope!
Security testing is a topic of its own. Needless to say, it needs to be done, and it's often better to get a third party to do final security testing on your software before shipping/releasing.
Usability testing: at the beginning of my talk we set intuitiveness and usability as a goal when building the software. If you spent the time to prototype and evaluate your designs before starting to lay down code, usability testing can be used to finalize the validation of the user interface design. Basically: is the final result as designed?
i18n testing: if you are building software that can handle multiple languages, you also need to test that it can handle the different character sets without breaking. What about right-to-left text?
The three Pillars of Enterprise Software
In this presentation we addressed the three pillars of enterprise software and how they drive the decisions we make when designing applications and/or software.
CONTENS Clients
[update list to include customers in USA]
Clients in nearly all industries and segments
Production, Services, Media, ...
Technical Details
[Optional: may or may not include in presentation]
Some Tech specs of CONTENS
Thank you Any Questions