Saturday, October 14, 2017

Case study notes on microservices


  • Martin Fowler's definition of a microservice: small, independently deployable, with automated deployment.
  • Compared to a monolithic application, microservices sacrifice consistency and accept higher deployment complexity to gain a more distributed architecture with strong separation, where each module can follow its own style. In a word: gaining partition tolerance by sacrificing consistency.
  • With REST APIs, each microservice can choose its own technology stack, instead of forcing every group onto the same one.
  • In a monolithic application, separation is done at the module level, but all modules share one code repo, so module boundaries blur. Separation depends on manually defined standards, which often can't be guaranteed; as time goes on, cross-boundary code becomes more and more unmanageable. In microservices, separation is done at the application level; the cost of crossing a boundary is significantly higher, so it happens less often and the separation holds.
  • In a distributed architecture, you can't rely on the DB to ensure consistency across services; instead you need message middleware or some other event-driven mechanism to reach eventual consistency.
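
A minimal sketch of eventual consistency through messaging, under stated assumptions: `MessageBroker`, `OrderService` and `InventoryService` are hypothetical names, and the broker is an in-memory stand-in for real message middleware. The order is recorded first; inventory catches up only once the event is delivered, and the handler is idempotent so a redelivered event does no harm.

```python
from collections import deque


class MessageBroker:
    """In-memory stand-in for message middleware (hypothetical, not a real broker API)."""

    def __init__(self):
        self._queue = deque()

    def publish(self, event):
        self._queue.append(event)

    def drain(self, handler):
        """Deliver every pending event to handler; return how many were delivered."""
        delivered = 0
        while self._queue:
            handler(self._queue.popleft())
            delivered += 1
        return delivered


class OrderService:
    """Owns order data and announces changes as events instead of writing to other services' data."""

    def __init__(self, broker):
        self.orders = {}
        self._broker = broker

    def place_order(self, order_id, sku, qty):
        self.orders[order_id] = {"sku": sku, "qty": qty}
        self._broker.publish({"order_id": order_id, "sku": sku, "qty": qty})


class InventoryService:
    """Owns stock data and updates it when an order event arrives."""

    def __init__(self, stock):
        self.stock = dict(stock)
        self._seen = set()

    def handle(self, event):
        # Idempotent: an event delivered twice is applied only once.
        if event["order_id"] in self._seen:
            return
        self._seen.add(event["order_id"])
        self.stock[event["sku"]] -= event["qty"]
```

Between `place_order` and `drain` the two services disagree about the state of the world; that window is the price paid for not sharing a database transaction.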
  • As more and more apps are deployed as microservices, one release can contain multiple apps, each possibly on a different tech stack. This raises the requirements for DevOps and automation, and further blurs the boundary between development and operations.
  • One strategy for moving from a monolith to microservices is to take baby steps:
    • minimize intrusion into existing applications;
    • use a mainstream microservice architecture;
    • focus on the services that really need the microservice transition.
  • For example, this company picked 3 modules for this strategy:
    • Registry Center: a Consul cluster handles all service registration, and Consul Template refreshes the Nginx configuration to implement load balancing;
    • Configuration Center: a self-developed "Matrix" system refreshes configurations through customized plugins, thus minimizing intrusion into existing applications;
    • Authorization Center: based on Spring Security OAuth, with SSO support (based on WeChat Enterprise Account).
  • Spring Boot: a traditional web application on Tomcat/JBoss/WebLogic relies on XML configuration, an application container and a .war package. A Spring Boot application can be run with "java -jar ..."; when it starts, it launches a built-in web container (such as an embedded Tomcat), then loads all configurations, classes and beans in the application. Because it wraps the web container inside the app, Spring Boot is well suited to microservice development.
  • Comparing Dubbo and Spring Cloud:
    • Dubbo is an open-source project from Alibaba, started in 2011. It develops Spring Beans in an EJB-like manner; this works, but is very hard to upgrade and maintain. Its last update is still from 10/2014 and the Spring version it supports is 2.5.6.SEC03, while Spring 5 is about to be released;
    • Spring Cloud is the leading microservice framework in the Java community, and it also uses Spring Boot as its foundation. Following the quick start guide, you can deploy a microservice demo in no time. However, it is better suited to rewriting your applications or starting new projects than to moving existing applications to a microservice architecture, because it is very hard to integrate.
  • Registry Center:
    • It is one of the most important components in a microservice architecture; its purpose is to decouple service providers from service consumers.
    • Because the architecture is distributed, any microservice should have more than one service provider. What is more, to support elastic computing, the number of service providers and their distribution are dynamic and can't be decided in advance. Static load balancing is therefore no longer suitable; a separate component is needed to manage service registration and service discovery, and this is the task of the Registry Center.
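
The registration and discovery roles described above can be sketched as follows. This is an illustrative in-memory model, not the API of any real registry; `ServiceRegistry` and its method names are assumptions made for the example. Providers come and go at run time, and consumers ask the registry for an address instead of using a static list.

```python
class ServiceRegistry:
    """Toy registry: maps a service name to the addresses currently providing it."""

    def __init__(self):
        self._instances = {}  # service name -> list of "host:port" strings
        self._cursors = {}    # service name -> next round-robin position

    def register(self, name, address):
        instances = self._instances.setdefault(name, [])
        if address not in instances:
            instances.append(address)

    def deregister(self, name, address):
        instances = self._instances.get(name, [])
        if address in instances:
            instances.remove(address)

    def lookup(self, name):
        """Return a copy of all known addresses for a service."""
        return list(self._instances.get(name, []))

    def pick(self, name):
        """Return one address, rotating round-robin over the current providers."""
        instances = self._instances.get(name, [])
        if not instances:
            raise LookupError(f"no provider registered for {name!r}")
        index = self._cursors.get(name, 0) % len(instances)
        self._cursors[name] = index + 1
        return instances[index]
```

The consumer only knows the service name; which provider answers is decided at call time from whatever is registered at that moment.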
  • There are 3 example architectures for a Registry Center:
    • Inside the application: such as Netflix Eureka, which integrates service registration and discovery into the application, so an app handles this task itself. Another example is using ZooKeeper or etcd to implement a service registry mechanism, often seen in big enterprise environments;
    • Outside the application: such as Airbnb SmartStack, which treats applications as black boxes and uses a mechanism outside the application to register a service in the registry;
    • DNS based: such as SkyDNS, which uses SRV records in DNS. It can also be counted in the "outside the application" category. Due to the limitations of DNS caching, it is not suitable for a complex microservice registry.
  • From a DevOps perspective, 5 points need to be considered:
    • Liveness check: after a service registers, how to test its liveness to ensure the service is usable;
    • Load balancing: when multiple service providers exist, how to balance load among them;
    • Integration: how to integrate with the Registry Center on the service provider end and on the API calling end;
    • Dependency on environment: what impact introducing the Registry Center has on the running environment;
    • Availability: how to ensure high availability of the Registry Center itself, especially how to get rid of single points of failure.
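
As an illustration of the liveness point, here is a minimal sketch of a TTL-style check, in which an instance stays "live" only while it keeps sending heartbeats. `TtlHealthTracker` is a hypothetical name, and the clock is passed in explicitly to keep the example deterministic; a real registry would run its own checks and timers.

```python
class TtlHealthTracker:
    """Consider an instance live only if it sent a heartbeat within the last ttl_seconds."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._last_seen = {}  # instance address -> time of last heartbeat

    def heartbeat(self, instance, now):
        self._last_seen[instance] = now

    def live_instances(self, now):
        """Return the instances whose last heartbeat is no older than the TTL."""
        return sorted(
            instance
            for instance, seen_at in self._last_seen.items()
            if now - seen_at <= self._ttl
        )
```

An instance that crashes simply stops sending heartbeats and drops out of the live set once the TTL has passed, so consumers stop being routed to it without any explicit deregistration.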
  • Some comparison of Eureka, SmartStack and Consul:
    • Eureka case study: the boundaries between provider, consumer and registry are clearly defined, and it does without a central bus to keep the distributed cluster highly available. The cons are that it intrudes into applications and only supports Java applications;
    • SmartStack case study: it is the most complicated architecture of the three. It uses ZooKeeper, HAProxy, Nerve and Synapse, and therefore places high demands on DevOps. Since it sits outside the application, it adapts to any application;
    • Consul is by nature an outside-the-application registry, yet you can integrate the Consul SDK and a local agent to simplify the registration process. When a service runs inside a container, it can use Registrator to implement self-registration. Service discovery relies on the SDK, but you can also use Consul Template to eliminate that SDK dependency.
  • Case study: Consul was picked as the architecture for the Registry Center, with two major considerations:
    • Minimize intrusion into existing applications; this is one of the key principles of this project;
    • Reduce complexity, by using Registrator and Consul Template to implement service self-registration and self-discovery.
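
To make the Consul Template idea concrete: the tool watches the registry and re-renders a config file whenever the set of providers changes, so the application itself needs no SDK. The sketch below imitates only the rendering step in Python; it is not Consul Template's own template syntax, and `render_upstream` is a hypothetical helper.

```python
def render_upstream(service, addresses):
    """Render an Nginx upstream block listing one server line per provider address."""
    if not addresses:
        # An upstream block without servers is not valid Nginx configuration.
        raise ValueError(f"no providers available for {service!r}")
    lines = [f"upstream {service} {{"]
    for address in sorted(addresses):
        lines.append(f"    server {address};")
    lines.append("}")
    return "\n".join(lines) + "\n"


def needs_reload(previous_config, new_config):
    """Reload the proxy only when the rendered text actually changed."""
    return previous_config != new_config
```

Sorting the addresses keeps the rendered output stable, so a registry response that lists the same providers in a different order does not trigger a reload.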
  • Configuration Center: everything from a PaaS down to a small caching layer depends on specific configuration to provide its service, and the same is true for microservices.
  • Some ways to categorize configurations:
    • By configuration source: such as source code, files, databases or remote API calls;
    • By running environment: such as dev, test, pre-release and production;
    • By integration phase: such as compile time, packaging time or run time;
      • Compile time: such as configuration inside the source code, or configuration files committed to the code repo together with the source code;
      • Packaging time: the configuration is packaged into the application package;
      • Run time: the specific configuration is not known until the app starts; the app gets its configuration from a local or remote source, then starts up.
    • By load method: such as loading only once or loading configurations dynamically.
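
A small sketch of how these categories can combine, under assumed names (`resolve_config` and `DynamicConfig` are hypothetical): packaged defaults are overridden per environment and then by run-time values, and a dynamic loader re-reads its source on demand instead of loading once.

```python
def resolve_config(packaged_defaults, environment_overrides, env, runtime_values=None):
    """Merge config layers; later layers win: packaged < environment < run time."""
    config = dict(packaged_defaults)
    config.update(environment_overrides.get(env, {}))
    config.update(runtime_values or {})
    return config


class DynamicConfig:
    """Holds a snapshot of config values and refreshes it from a fetch function on demand."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable returning a dict, e.g. a remote API call
        self._values = fetch()   # initial load at start-up

    def get(self, key, default=None):
        return self._values.get(key, default)

    def refresh(self):
        """Reload values; a load-once setup would simply never call this."""
        self._values = self._fetch()
```

Keeping the precedence order in one function makes it easy to answer the question "where did this value come from" when dev and production behave differently.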
  • At the beginning, configuration and source code were kept together in the git repo. Then, for security reasons, configuration data was separated out and put on CI servers, integrated into app packages by packaging tools, or placed directly in specific folders on the servers. As complexity increased, these methods couldn't keep up, so a dedicated Configuration Center was needed.
  • To pick an architecture, consider the following points:
    • Security: config data shouldn't be in source code, which keeps production secure;
    • Environments need to be separated, such as dev/test/qa/prod, and so does their config data;
    • Within the same environment, config data needs to be consistent for all services;
    • Config data needs to be remotely manageable in a distributed environment.
  • Some examples:
    • Spring Cloud Config: specific to Spring. Environment maps to profile and app version maps to label. The server side uses Git/filesystem/Vault to store and manage config data; the client side uses Spring configuration and loads config at run time;
    • Disconf: developed by a Baidu engineer. The server side has a GUI to manage all running environments, apps and config files; the client side uses Spring AOP to load and refresh config data automatically.
    • In this case, since both of these require Spring on the client side, the self-developed "Matrix" is used instead, which also works for non-Spring services. It has 3 special capabilities:
      • It separates config files from config items. For config files, it uses plug-ins (SBT, Maven, Gradle) and works through the packaging tools to minimize intrusion into the CI system. For config items, it provides an SDK that retrieves items from the server and caches them;
      • It adds dimensions to an app, so the same app can be deployed under different server versions or app versions and still have its config data managed;
      • Config data is version controlled, like Git, and can be rolled back to a previous version.
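
The notes do not describe how "Matrix" is implemented, so the following is only an assumed illustration of the last two capabilities: config keyed by several dimensions (app, environment, app version) with a revision history that supports rollback. `VersionedConfigStore` and its methods are hypothetical.

```python
class VersionedConfigStore:
    """Keeps every published config snapshot per (app, env, app_version) key."""

    def __init__(self):
        self._history = {}  # (app, env, app_version) -> list of snapshots

    def publish(self, app, env, app_version, values):
        """Store a new snapshot and return its revision number (starting at 1)."""
        history = self._history.setdefault((app, env, app_version), [])
        history.append(dict(values))
        return len(history)

    def current(self, app, env, app_version):
        """Return a copy of the latest snapshot for this key."""
        return dict(self._history[(app, env, app_version)][-1])

    def rollback(self, app, env, app_version, revision):
        """Re-publish an earlier revision as the newest one, keeping history intact."""
        history = self._history[(app, env, app_version)]
        if not 1 <= revision <= len(history):
            raise ValueError(f"unknown revision {revision}")
        history.append(dict(history[revision - 1]))
        return len(history)
```

Rolling back by appending, rather than deleting newer revisions, keeps a complete audit trail of what was live and when.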
  • Authentication Center: it covers both authentication and authorization. 3 common architectures:
    • Simple Auth: not provided by the service itself, but by the environment, such as an IP whitelist or domain name checks. Used in simple apps;
    • Protocol Auth: the service issues a key pair to each client. To call the service, the client signs the request with the secret half and sends the signature together with its client ID in an auth header, and the service authenticates the request from that header. Used in C/S-type applications (such as Amazon S3);
    • Central Auth: with an Authentication Center in place, a client that wants to call a service first asks the Auth Center for an authorization token, then submits its request together with the token. When the service receives the request, it first asks the Auth Center to resolve the client's ID from the token, then decides whether to authorize access. This separates the auth function from the service, so it persists while the app goes through upgrades; the Auth Center only needs to refresh its auth rules to adapt.
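
A minimal sketch of the Protocol Auth idea, assuming an HMAC-SHA256 signature over the request computed with a per-client secret. The header scheme `SKETCH-HMAC`, the signed fields and the function names are all made up for illustration; this is not the actual Amazon S3 signing scheme.

```python
import hashlib
import hmac


def _signature(secret_key, method, path, timestamp):
    """HMAC-SHA256 over the request fields, hex encoded."""
    message = f"{method}\n{path}\n{timestamp}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def sign_request(client_id, secret_key, method, path, timestamp):
    """Client side: build an auth header carrying the client ID and the signature."""
    return f"SKETCH-HMAC {client_id}:{_signature(secret_key, method, path, timestamp)}"


def verify_request(header, secrets_by_client, method, path, timestamp):
    """Service side: return the authenticated client ID, or None if the header is invalid."""
    scheme, _, credentials = header.partition(" ")
    client_id, _, signature = credentials.partition(":")
    secret_key = secrets_by_client.get(client_id)
    if scheme != "SKETCH-HMAC" or secret_key is None:
        return None
    expected = _signature(secret_key, method, path, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return client_id if hmac.compare_digest(signature, expected) else None
```

Because the signature covers the method, path and timestamp, a header captured for one request cannot be reused for a different one; a real scheme would also reject timestamps that are too old.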
  • About OAuth:
    • A: Authorization Request: Client -> Resource Owner
    • B: Authorization Grant: Resource Owner -> Client
    • C: Authorization Grant: Client -> Authorization Server
    • D: Access Token: Authorization Server -> Client
    • E: Access Token: Client -> Resource Server
    • F: Protected Resource: Resource Server -> Client
    • In microservices, the service provider is the combination of resource owner and authorization server.
    • OAuth 2.0 has the Bearer Token: the client's identity is stored inside the access token, so the resource server can validate the access token itself, reducing steps A-F to 4 steps.
    • Two main formats of Bearer Token are SAML and JWT: SAML is based on XML, JWT is based on JSON. Since JSON is widely used in microservices, JWT is more common.
  • Some OAuth implementations:
    • CAS
    • Apache Oltu
    • Spring Security OAuth
    • OAuth-Apis
    • CAS and Spring Security OAuth are better in terms of integration cost and extensibility.
    • This case uses Spring Security OAuth, with a private certificate, scope-based authorization and domain checks.