Tuesday, January 13, 2009

Terracotta Usage in Ecommerce

As this Computerworld article indicates, even as late as 2008, most E-Commerce sites have not yet solved their scale and availability issues, resulting in serious business losses at a time when they can ill afford them. Most E-Commerce sites are implemented in Java. The causes of outages vary widely, but the usual suspects include (but are not limited to) the following:
  1. Malicious DoS attacks / unintentional but excessive "spidering"
  2. Poor Capacity Planning at the network tier:
    • Poor or Non-Existent CDN strategy (so no traffic is deflected off Source site)
    • Load Balancer being overwhelmed.
    • Network Bandwidth saturation
  3. Poor Systems infrastructure
  4. Misconfigured HTTP server settings (e.g. the various parameters in httpd.conf, if Apache)
  5. Misconfigured App-Server settings (e.g. the various parameters in web.xml/server.xml if Tomcat, Connector-related settings etc.)
  6. Poor Garbage Collection tuning on JVMs (e.g. a large number of GCs, long Full GCs).
  7. Database/Application misconfiguration (e.g. sizing of the SGA, DB Connection Pool etc.)
  8. Software Development Issues:
    • Database overwhelmed in terms of Reads/Writes; Quality of PL/SQL Algorithms/Code (e.g. DB Locks, Full table scans etc.)
    • Poor Application Architecture/ Design/ Implementation/ Configuration.

The last one - i.e. poor design and implementation of the servlets that constitute the Commerce Site and Backend Systems - certainly consumes the bulk of the Development team's effort and time. A common theme among front-end Java Developers is over-dependence on an already heavily used database, simply because it is there and provides persistence. This seriously limits scale: traffic spikes, when the business most needs the Systems to be available (given the number of prospective customers), are exactly when the system blacks out or browns out.

Terracotta can help reduce, and in many cases eliminate, RDBMS usage. It allows you to operate safely "in-memory", with state that is durable across JVM life-cycles, so that you get scale and HA while keeping a POJO-based programming environment - see http://www.terracotta.org and especially http://www.terracotta.org/web/display/orgsite/Kill+Your+Database

LET US SEE HOW Terracotta would add value specifically in E-Commerce Applications. Typically, the E-Commerce business involves both FRONT-END systems (Websites, SOAP services etc.) and BACK-END systems (Supply Chain Planning/Fulfillment/Warehousing), i.e.

A> ORDER ACQUISITION SYSTEMS to enable:







SUB SYSTEM | PURPOSE | COMMON TERMINOLOGY/ PROBLEMS | COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS | TERRACOTTA VALUE-ADD

ELECTRONIC CATALOG CREATION/ MODIFICATION
    • PURPOSE: MERCHANTS decide on what to sell. Procure inventory through BUYERS. Product goes through a large workflow (akin to CMS) before it is ready for publication to the catalog (e.g. copy, price, description, photography etc.) as a saleable item.
    • COMMON TERMINOLOGY/ PROBLEMS: ITEM CREATION/CATALOG MODIFICATION: Typically, each of these state changes via this CMS-like Workflow is stored in a RDBMS.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: Employ a home-grown or off-shelf CMS or Document Management System before final publication to the catalog database.

ELECTRONIC CATALOG PRESENTATION
    • PURPOSE: Users browse through a pre-determined product classification hierarchy - Departments/Categories/Sub-Categories/Shelves.
    • COMMON TERMINOLOGY/ PROBLEMS: BROWSE: Not caching Browsing activity leads to RDBMS saturation, especially under high volume. Most sites typically report that 90% of all activity on the site is BROWSE/SEARCH.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: CATALOG CACHE - i.e. local Cache on each JVM of Catalog Database queries.

    • PURPOSE: Users search for specific products.
    • COMMON TERMINOLOGY/ PROBLEMS: SEARCH: Not caching Search activity leads to Search Engine saturation, especially under high volume.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: Implement SEARCH CACHE on each local JVM.

    • PURPOSE: Inventory position (i.e. Availability Status) being up to date is of utmost importance. If Out-Of-Stock is displayed (when inventory currently exists) then lost sales imply opportunity cost. If In-Stock is displayed and the item is out of stock, there is a fulfillment issue.
    • COMMON TERMINOLOGY/ PROBLEMS: INVENTORY CACHE: If not cached, the Database is saturated. If cached - typically, the Inventory cache is not up to date vis-à-vis back-end systems and is inconsistent across JVMs.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: INVENTORY CACHE is locally cached and when it changes within the database, the change is distributed via JMS. Alternatively, there is no change propagation but one could keep the INVENTORY CACHE TTL very low (e.g. a few minutes). In either case, there is a risk of the CACHE position being incorrectly reflected across JVMs at any point in time.

USER AUTHENTICATION, INTEREST and PURCHASE
    • PURPOSE: Mechanisms to identify the user, allowing the user to express Desire for a product and Enabling the retail Transaction.
    • COMMON TERMINOLOGY/ PROBLEMS: AUTHENTICATION/ AUTHORIZATION information - needs to be preserved across requests since HTTP is a stateless protocol.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: Typically state is preserved in a HTTP SESSION that is keyed off session-id, and the session-id is written as a cookie to the user's browser. If the HTTP Session is not replicated and persisted elsewhere, losing a JVM would imply a poor user experience, since the user would have to re-authenticate and/or re-establish their position in a workflow.

    • PURPOSE: Allow the user to express interest in a basket of products.
    • COMMON TERMINOLOGY/ PROBLEMS: SHOPPING CART/ WISH LISTS
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: Stored in HTTP Session or Cache keyed off customer-id (if registered customer) / visitor-id (if unregistered customer). If the HTTP Session is not replicated and persisted elsewhere, losing a JVM would imply a poor user experience, since the user would lose his/her cart.

    • PURPOSE: Enable execution of the buying transaction.
    • COMMON TERMINOLOGY/ PROBLEMS: CHECKOUT - typically implemented as a workflow since it is a multi-stage process across several HTTP Requests involving Credit-Card information, Shipping information etc.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: The user's position in the workflow and pre-validated input is typically stored in the HTTP Session. If the HTTP Session is not replicated and persisted elsewhere, losing a JVM would imply a poor user experience, since the user would be thrown out of the checkout process and have to restart all over.
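
To make the low-TTL INVENTORY CACHE approach concrete, here is a minimal per-JVM sketch in plain Java. The class name InventoryCache, the loader function (standing in for the inventory database query) and the injectable clock are all hypothetical, introduced only for illustration; nothing here is a Terracotta API. It shows why a local TTL cache bounds staleness without removing it: until an entry expires, each JVM keeps serving whatever it last loaded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Per-JVM inventory cache whose entries are reloaded once they are older than a fixed TTL. */
class InventoryCache {
    /** Cached availability plus the time at which it was loaded. */
    private static final class Entry {
        final int unitsAvailable;
        final long loadedAtMillis;

        Entry(int unitsAvailable, long loadedAtMillis) {
            this.unitsAvailable = unitsAvailable;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, Integer> loader; // hypothetical stand-in for the database query
    private final long ttlMillis;
    private final LongSupplier clock;               // injectable so expiry can be tested

    InventoryCache(Function<String, Integer> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached unit count, reloading it when the entry is missing or expired. */
    int unitsAvailable(String sku) {
        long now = clock.getAsLong();
        // compute() runs atomically per key, so concurrent callers trigger only one reload.
        Entry entry = entries.compute(sku, (key, old) ->
                (old == null || now - old.loadedAtMillis >= ttlMillis)
                        ? new Entry(loader.apply(key), now)
                        : old);
        return entry.unitsAvailable;
    }

    /** True if the (possibly stale) cached position shows at least one unit. */
    boolean inStock(String sku) {
        return unitsAvailable(sku) > 0;
    }
}
```

With a TTL of a few minutes, an item that sells out in the database can still be shown as In-Stock by this JVM until the entry expires, and two JVMs that loaded at different times can disagree in the meantime. That is the inconsistency risk described in the table.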


B> ORDER FULFILLMENT SYSTEMS:




SUB SYSTEM | PURPOSE | COMMON TERMINOLOGY/ PROBLEMS | COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS | TERRACOTTA VALUE-ADD

OMS (ORDER MANAGEMENT SYSTEMS)
    • PURPOSE: Decomposition of the order into order line-items (e.g. an order may include flowers and basket balls, each of which is fulfilled by a different distributor/warehouse).
    • COMMON TERMINOLOGY/ PROBLEMS: OMS typically requires a fair bit of complex algorithmic knowledge to arrive at the right decomposition of Orders into Order-line items.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: Typically executed as a DBMS job. The database is the right place to do this work given the algorithmic complexity and the amount of data to be referenced.

FULFILLMENT SYSTEMS
    • PURPOSE: Now that orders have been collected, there is a whole machinery needed to actually deliver the product to the customer.
    • COMMON TERMINOLOGY/ PROBLEMS: FULFILLMENT SYSTEMS: Must handle B2B work with 3rd party distributors and warehouses along with the order line-item lifecycle.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: Typically state-changes to the order-line-item lifecycle are reflected within a Fulfillment and Order Management database, which then results in backlogs from "Order Confirmation" Emails to updates of the status of the order line-item as it proceeds through the warehouse.

CUSTOMER SERVICE
    • PURPOSE: Deal with customers who have had issues either during placement of the order or in terms of the order fulfillment i.e. returns, etc.
    • COMMON TERMINOLOGY/ PROBLEMS: CUSTOMER SERVICE call spike could impact the Customer and Order Database in terms of read/write activity.
    • COMMON SOLUTIONS/ PROBLEMS with the SOLUTIONS: A complex set of entities (Customers, Order History, Fulfillment status of specific orders etc.) needs to be cached. The problem with caching this data per customer is that the cache-hit ratio is very low (i.e. the customer profile is typically needed for just the one contact the customer may have with customer service). One could pre-hydrate certain portions of the cache (e.g. if there is a spike of calls around a particular product) and/or Distribute a Reference Cache of Products to cut database latency to a certain extent.
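
One common way to relieve the backlog of order-line-item state changes described for FULFILLMENT SYSTEMS is to buffer the changes in memory and write them to the database in batches (write-behind). The sketch below is a single-JVM illustration in plain Java, not a Terracotta API. The names WriteBehindBuffer, StatusChange and batchWriter are hypothetical, and batchWriter stands in for whatever batched database update the application would perform. Note that a plain in-memory queue like this loses its contents if the JVM dies, so the approach is only safe when the queue itself is made durable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Buffers order-line-item status changes and hands them to a writer in batches. */
class WriteBehindBuffer {
    /** One lifecycle change for one order line-item (hypothetical shape). */
    static final class StatusChange {
        final String orderLineItemId;
        final String newStatus;

        StatusChange(String orderLineItemId, String newStatus) {
            this.orderLineItemId = orderLineItemId;
            this.newStatus = newStatus;
        }
    }

    private final BlockingQueue<StatusChange> pending = new LinkedBlockingQueue<>();
    private final Consumer<List<StatusChange>> batchWriter; // stand-in for a batched database update

    WriteBehindBuffer(Consumer<List<StatusChange>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    /** Called on the request path: enqueue the change and return immediately. */
    void record(StatusChange change) {
        pending.add(change);
    }

    /** Called by a background thread: drains up to maxBatch changes in FIFO order and writes them. */
    int drainOnce(int maxBatch) {
        List<StatusChange> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatch);
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
        return batch.size();
    }

    /** Number of changes not yet handed to the writer. */
    int backlog() {
        return pending.size();
    }
}
```

The request path only pays for an in-memory enqueue, and the database sees a smaller number of larger writes.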

IN SUMMARY: You could use Terracotta for:


  1. HTTP SESSION CLUSTERING:
    • To provide high availability to your HTTP SESSION (which likely contains user-authentication information, shopping cart, position within checkout).
    • Terracotta's Session Clustering implementation is the best in the market in terms of minimizing session replication overhead, transparency and visibility.

  2. CACHE DISTRIBUTION:
    • To distribute CATALOG CACHE, SEARCH CACHE, INVENTORY CACHE and elements of the family of Caches needed for CUSTOMER SERVICE.
    • Terracotta supports a variety of distributed caches (from Ehcache to Hibernate to POJOs implemented as HashMaps, ConcurrentHashMaps (CHMs) or any arbitrary grouping of collections etc.)

  3. INTER-JVM CO-ORDINATION/BATCH PROCESSING
    • To co-ordinate amongst participating JVMs for Customer notification, Work allocation amongst warehouses, Update of Fulfillment life-cycle changes (asynchronously drained to the database), etc.
    • Terracotta master-worker framework encapsulates the distribution of work amongst "workers" and aggregation of the output of individual "workers" and exposes some interfaces for you to implement any application specific logic.

  4. In addition, there are several other systems that comprise an E-Commerce business:
    • WAREHOUSE MANAGEMENT SOFTWARE: Planning for work for each shift of the warehouse and Enabling the execution of the Pick, Cartonization, Shipment of the packaged product. (e.g. Retek)
    • WAREHOUSE PLANNING: Linear Programming Exercise to minimize shipping costs given demand forecast and Warehouse Locations and Fulfillment portfolio (e.g. Manugistics, ILog).
    • DEMAND FORECASTING: Forecasting demand based on history and new trends. Grouping the demand temporally and geographically (e.g. Manugistics, i2 etc.)
    • BUSINESS METRICS MANAGEMENT: Recording the user click-trail and making marketing/merchandising decisions (e.g. Coremetrics, Omniture etc.) and SEVERAL OTHERS
    • However, these are often procured as shrink-wrapped software from other vendors, so you may have more luck persuading the vendor to partner with Terracotta if the scale and HA of the solution are unsatisfactory. Alternatively, if these are home-grown Java-based systems, you could investigate which data-structures need to be DSO (Distributed Shared Objects) so that they become durable. Since Terracotta is a general purpose platform that clusters at the JVM-Level, it is applicable in any POJO-based app. See http://www.terracotta.org/web/display/orgsite/JVM+Level+Clustering
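
The master-worker arrangement mentioned under INTER-JVM CO-ORDINATION/BATCH PROCESSING can be sketched inside a single JVM with java.util.concurrent. This is only an illustration of the pattern under stated assumptions: the class name Master and its process method are hypothetical and are not the interfaces of Terracotta's master-worker framework. In a clustered deployment the workers would run in other JVMs and the work items would travel through shared data structures rather than a local thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Distributes work items to a pool of workers and aggregates their results in input order. */
class Master<W, R> {
    private final int workerCount;
    private final Function<W, R> workerLogic; // application-specific logic supplied by the caller

    Master(int workerCount, Function<W, R> workerLogic) {
        this.workerCount = workerCount;
        this.workerLogic = workerLogic;
    }

    /** Runs workerLogic on every item and returns the results in the same order as the items. */
    List<R> process(List<W> workItems) {
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (W item : workItems) {
                tasks.add(() -> workerLogic.apply(item));
            }
            List<R> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and returns futures in task order.
            for (Future<R> future : workers.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A worker failed", e.getCause());
        } finally {
            workers.shutdown();
        }
    }
}
```

For example, new Master<String, Integer>(3, String::length).process(...) would hand each string to a worker and collect the lengths. The caller supplies only the per-item logic, which mirrors the division of labour the list describes.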

See http://www.terracotta.org and http://www.terracottatech.com for more detail.

About Me

I have spent the last 11+ years of my career either working for a vendor that provides infrastructure software or managing Platform Teams that deal with software-infrastructure concerns at large online presences (e.g. at Walmart.com and currently at Shutterfly). I am interested in building and analyzing systems that speak to cross-cutting concerns across application development in the enterprise (scalability, availability, security, manageability, latency etc.)