Problem
Running and testing services in the Lagom development environment, with its support for all needed tooling, is a smooth and straightforward process. Everything is prepared and ready to use out of the box.
Deploying and running it in production requires a complete environment setup and is not as straightforward at first (at least it was not for me :)).
Lagom (>1.5.1) comes with out-of-the-box support for running on Kubernetes.
Kubernetes cluster setup depends on the chosen Kubernetes implementation and is out of the scope of this blog. Personally, I'm using Amazon EKS.
So the question is: once you have a Kubernetes cluster running, what else is required to deploy and run your Lagom microservice system?
Solution
Kubernetes basics
I will assume you have basic knowledge of Kubernetes; if not, I strongly recommend going through the official Kubernetes documentation, with focus on the topics covered in the sections below.
Kubernetes cluster management access
A Kubernetes cluster is managed via the kubectl CLI tool, which needs to be preconfigured to access a specific Kubernetes cluster.
When you have multiple Kubernetes clusters running (test, production #1, production #2, …), switching between kubectl configurations is prone to errors and can end up with operations being performed on the wrong cluster (a minimal example of context switching is shown after the list below). To avoid this I tend to use a bastion-host-based solution.
Depending on the Kubernetes cluster network location you could:
- dedicate bastion host per kubernetes cluster
- dedicate bastion host OS user per kubernetes cluster
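For illustration, this is roughly what context switching looks like when several clusters share one kubeconfig (the context names are hypothetical):

# List all contexts configured in the current kubeconfig
kubectl config get-contexts
# Switch to another cluster (easy to get wrong when names are similar)
kubectl config use-context production-1
# Always double-check which cluster you are pointing at before applying anything
kubectl config current-context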
Kubernetes namespace organization
Kubernetes uses namespaces to support multiple virtual clusters on one physical cluster. Namespaces can also be used for multi-tenant deployments, but I like to avoid that to keep the setup as simple as possible.
By default, Kubernetes comes with three preconfigured namespaces: default, kube-public and kube-system.
I use kube-system for deploying and running Kubernetes resources unrelated to my Lagom system.
For the Lagom system you could use default, but I prefer to create a separate namespace that groups my Lagom system services into one logical group.
Example namespace resource configuration:
apiVersion: v1
kind: Namespace
metadata:
  name: lagom
  labels:
    name: lagom
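Assuming the resource above is saved as namespace.yaml, it can be created and verified with:

kubectl apply -f namespace.yaml
kubectl get namespaces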
Helm
Helm is the Kubernetes package manager. I see it as an APT-like tool for Kubernetes.
For me, the main benefits of Helm are:
- using different Helm repositories to get access to official and community-created Kubernetes tools (you will see later which ones) in order to simplify their configuration and deployment (see the sketch after this list)
- templating Lagom Kubernetes resources to simplify their configuration and deployment when running a high number of services
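A minimal sketch of the repository workflow, using Helm 2 syntax as in the rest of this post (chart and release names are only examples):

# Refresh chart information from the configured Helm repositories
helm repo update
# Find community charts, e.g. for the NGINX ingress controller used later in this post
helm search nginx-ingress
# Install a chart into a dedicated namespace
helm install stable/nginx-ingress --name my-nginx-ingress --namespace kube-system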
Lagom kubernetes support
Deploying Lagom on Kubernetes requires manual creation of Kubernetes resources (deployment, service, ingress).
To optimize Kubernetes resource maintenance, I would recommend creating your own Helm chart that is reused (by supplying a service-specific values.yaml) for all your Lagom services.
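As a sketch of how such a shared chart could be reused per service (the chart directory and values file names are hypothetical, Helm 2 syntax):

# Install the shared Lagom service chart once per service,
# supplying service-specific settings via its own values.yaml
helm install ./charts/lagom-service --name account --namespace lagom -f account-values.yaml
helm install ./charts/lagom-service --name transaction --namespace lagom -f transaction-values.yaml
# Roll out a new service version by upgrading the release with updated values
helm upgrade account ./charts/lagom-service -f account-values.yaml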
Example of deployment.yaml:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: "account-v1-0-0"
  labels:
    app: "account"
    appVersion: "account-v1-0-0"
  namespace: lagom
spec:
  replicas: {initialNumberOfReplicas}
  selector:
    matchLabels:
      appVersion: "account-v1-0-0"
  template:
    metadata:
      labels:
        app: "account"
        appVersion: "account-v1-0-0"
    spec:
      restartPolicy: Always
      containers:
        - name: "account"
          image: "{dockerRepositoryUrl}/account_impl:1.0.0"
          imagePullPolicy: Always
          env:
            - name: "JAVA_OPTS"
              value: " -Dconfig.resource=production.conf
                       -Dplay.http.secret.key={playSecret}
                       -Dlogger.resource=logback-prod.xml
                       -Dplay.server.pidfile.path=/dev/null
                       -Dlagom.akka.discovery.service-name-mappings.elastic-search.lookup=_http._tcp.elasticsearch.lagom.svc.cluster.local
                       -Dlagom.akka.discovery.service-name-mappings.cas_native.lookup=_cql._tcp.cassandra.lagom.svc.cluster.local
                       -Dlagom.akka.discovery.service-name-mappings.kafka_native.lookup=_broker._tcp.kafka.lagom.svc.cluster.local "
            - name: "REQUIRED_CONTACT_POINT_NR"
              value: "{initialNumberOfReplicas}"
            - name: "SERVICE_NAMESPACE"
              value: "lagom"
            - name: "SERVICE_NAME"
              value: "account"
            - name: "KUBERNETES_POD_IP"
              valueFrom:
                fieldRef:
                  fieldPath: "status.podIP"
          ports:
            - containerPort: 9000
              name: http
            - containerPort: 2552
              name: remoting
            - containerPort: 8558
              name: management
          readinessProbe:
            httpGet:
              path: "/ready"
              port: "management"
            periodSeconds: 10
            initialDelaySeconds: 20
            failureThreshold: 4
          livenessProbe:
            httpGet:
              path: "/alive"
              port: "management"
            periodSeconds: 10
            initialDelaySeconds: 60
            failureThreshold: 2
          resources:
            requests:
              cpu: 0.5
              memory: "512Mi"
Note: JAVA_OPTS has been formatted across multiple lines for better readability, so in case of copy/paste, join it into one line.
Example of service.yaml:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: "account"
  name: "account"
  namespace: "lagom"
spec:
  ports:
    - name: http
      port: 9000
      protocol: TCP
      targetPort: 9000
    - name: remoting
      port: 2552
      protocol: TCP
      targetPort: 2552
    - name: management
      port: 8558
      protocol: TCP
      targetPort: 8558
  selector:
    app: "account"
Example of ingress.yaml:
apiVersion: "extensions/v1beta1"
kind: Ingress
metadata:
  name: "account-internal-ingress"
  annotations:
    kubernetes.io/ingress.class: "nginx-internal"
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    ingress.kubernetes.io/ssl-redirect: "false"
  namespace: "lagom"
spec:
  rules:
    - http:
        paths:
          - path: "/api/account"
            backend:
              serviceName: "account"
              servicePort: 9000
---
apiVersion: "extensions/v1beta1"
kind: Ingress
metadata:
  name: "account-external-ingress"
  annotations:
    kubernetes.io/ingress.class: "nginx-external"
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    ingress.kubernetes.io/ssl-redirect: "false"
  namespace: "lagom"
spec:
  rules:
    - http:
        paths:
          - path: "/api/external/account"
            backend:
              serviceName: "account"
              servicePort: 9000
Example of production.conf (Java):
include "application.conf"

lagom.cluster.exit-jvm-when-system-terminated = on

akka {
  actor {
    provider = cluster
  }
  cluster {
    shutdown-after-unsuccessful-join-seed-nodes = 60s
  }
  discovery {
    method = akka-dns
    kubernetes-api {
      pod-namespace = ${SERVICE_NAMESPACE}
      pod-label-selector = "app=%s"
      pod-port-name = management
    }
  }
  management {
    cluster {
      bootstrap {
        contact-point-discovery {
          discovery-method = kubernetes-api
          service-name = ${SERVICE_NAME}
          required-contact-point-nr = ${REQUIRED_CONTACT_POINT_NR}
          protocol = "tcp"
          kubernetes-api {
            pod-namespace = ${SERVICE_NAMESPACE}
            pod-port-name = management
            pod-label-selector = "app=%s"
          }
        }
      }
    }
    http {
      port = 8558
      bind-hostname = ${KUBERNETES_POD_IP}
    }
  }
}
Check the official documentation for more details: Running Lagom in production
I also recommend the blog post CPU considerations for Java applications running in Docker and Kubernetes.
Lagom service call access control
Service API calls can be categorized, depending on the access control requirements, into:
- internal calls
- external calls
Internal calls are made by trusted callers (service-to-service calls within the same Kubernetes cluster) and therefore do not require any access control. Communication does not need to be encrypted and the caller does not need to authenticate (simple caller identification can be used if required).
In a Kubernetes cluster, service-to-service communication is done by directly accessing the pod IPs and port of the target service. When one service wants to connect to another, the caller service uses the Lagom ServiceLocator to locate, via the Kubernetes API/Akka DNS, the target service's pod IPs and port. Lagom services are located based on the service name specified in the API descriptor. The service name is deployed as a Kubernetes service resource.
named("account-service")
Kubernetes resource names are restricted (they must be valid DNS labels: lowercase alphanumeric characters and '-'), and therefore service names are restricted too. It is important to follow these restrictions when defining the service name in the API descriptor!
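As a quick sanity check of service discovery, the pod IPs behind a deployed service can be inspected via its Kubernetes endpoints (using the account service from the examples above):

# Show the Kubernetes service resource and the pod IPs/ports behind it
kubectl -n lagom get service account
kubectl -n lagom get endpoints account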
External calls are made externally (from the Internet) by untrusted callers and should require communication encryption and caller authentication.
In Kubernetes, external access to a service is configured using an ingress resource.
One Lagom service can have one or more ingress resources deployed, configuring which ACLs are used.
In order for the ingress resource to work, the cluster must have an ingress controller deployed and running.
The most commonly used ingress controller implementation is the NGINX ingress controller. It is popular because it is supported and maintained by the Kubernetes project itself and can be deployed on almost all Kubernetes implementations.
The NGINX ingress controller manages the entire lifecycle of NGINX by subscribing to ingress resource events (ADD/REMOVE) via the Kubernetes API, based on which the NGINX location configuration is updated automatically at runtime.
The NGINX ingress controller can be deployed using the NGINX ingress controller Helm chart. Be sure to specify the namespace (kube-system) and the ingress controller name:
--namespace kube-system --set controller.ingressClass=nginx-external
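A complete install command might look like this (chart and release names are illustrative and depend on the chart version you use):

# Deploy the NGINX ingress controller with a dedicated ingress class
helm install stable/nginx-ingress \
  --name external-nginx-ingress-controller \
  --namespace kube-system \
  --set controller.ingressClass=nginx-external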
An ingress resource can be configured to target a specific ingress controller, if multiple controllers are running, by specifying the ingress controller name in an ingress resource annotation:
kubernetes.io/ingress.class: nginx-external
You can check this annotation usage in the ingress.yaml example in the Lagom kubernetes support section.
So an ingress resource is used to expose a service for external access. If a service requires both internal and external access, the two need to be differentiated.
This can be done by specifying different URL-based contexts.
For example:
/api/accounts            # internal context
/api/external/accounts   # external context
For /api/external/accounts an ingress resource needs to be deployed to allow external access, while for /api/accounts no ingress resource is required because it is only used for internal access.
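For illustration, using the account service from the ingress.yaml example above (the external hostname is hypothetical), the two access paths could be exercised like this:

# Internal access: e.g. checking the endpoint from another pod inside the cluster, no ingress involved
curl http://account.lagom.svc.cluster.local:9000/api/account
# External access: through the external NGINX ingress and the SSL/TLS termination point
curl https://api.example.com/api/external/account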
Kubernetes SSL/TLS encryption support
In Kubernetes, SSL/TLS is configured using specific annotations on the ingress resource. Based on this configuration, the ingress controller configures and implements SSL/TLS termination.
This means we would need to apply the SSL/TLS configuration to every externally accessible service's ingress resource, which would not be convenient to maintain. In most use cases the same SSL/TLS configuration (a single SSL/TLS termination point) is used for accessing all externally accessible services.
To resolve this we could use one of these two solutions (that I'm aware of):
- Use and configure NGINX controller with Default SSL configuration
- Deploy additional Ingress controller dedicated for SSL/TLS termination
Solution #1 is explained in referenced documentation.
Solution #2 is to deploy one extra ingress controller dedicated to SSL/TLS termination, together with a "singleton" ingress resource that configures the SSL/TLS termination and forwards all traffic to the already created NGINX controller. Singleton in this context means that only one such ingress resource is deployed.
With this solution we "extract" the SSL/TLS termination point from the already created ingress controller and thereby avoid configuring SSL/TLS per service ingress resource.
For SSL/TLS dedicated ingress controller we could use:
- NGINX Ingress controller
- depending on the Kubernetes implementation used, a cloud-provider-specific ingress controller. I use the Amazon EKS ALB Ingress controller, which leverages the Amazon Application Load Balancer.
If your Kubernetes implementation only allows the NGINX ingress controller, solution #2 generally does not make sense and I would suggest going with solution #1.
When using a cloud provider, a cloud-provider-specific ingress controller brings the advantage of securing external access outside of your Kubernetes environment, as opposed to the NGINX ingress controller, which runs inside the Kubernetes cluster.
Example of deploying the Amazon EKS ALB Ingress controller using the aws-alb-ingress-controller Helm chart:
helm install incubator/aws-alb-ingress-controller \
  --name=external-alb-ingress-controller \
  --namespace kube-system \
  --set autoDiscoverAwsRegion=true \
  --set autoDiscoverAwsVpcID=true \
  --set clusterName=myK8s
AWS ALB singleton ingress resource:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: "ssl-alb"
  namespace: kube-system
  labels:
    app: "sslAlb"
  annotations:
    kubernetes.io/ingress.class: "alb"
    alb.ingress.kubernetes.io/scheme: "internet-facing"
    alb.ingress.kubernetes.io/target-type: "instance"
    alb.ingress.kubernetes.io/security-groups: {mySecurityGroupIds}, ...
    alb.ingress.kubernetes.io/subnets: {myVPCSubnetIds}, ...
    alb.ingress.kubernetes.io/certificate-arn: {myAcmCertificateArn}
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80,"HTTPS": 443}]'
    alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
    alb.ingress.kubernetes.io/healthcheck-path: "/"
    alb.ingress.kubernetes.io/success-codes: "200,404"
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: ssl-redirect
              servicePort: use-annotation
          - path: /*
            backend:
              serviceName: "external-nginx-ingress-controller-controller"
              servicePort: 80
External access authentication
For authentication, different methods are applicable (HTTP basic auth, JWT, mutual SSL/TLS, …).
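As one possible sketch, HTTP basic auth can be enforced at the NGINX ingress controller level using its auth annotations (the user and secret names below are examples):

# Create an htpasswd file and store it as a Kubernetes secret named basic-auth
htpasswd -c auth external-user
kubectl -n lagom create secret generic basic-auth --from-file=auth
# Then reference it from the external ingress resource via annotations:
#   nginx.ingress.kubernetes.io/auth-type: basic
#   nginx.ingress.kubernetes.io/auth-secret: basic-auth
#   nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"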
Lagom access to “external services” (Cassandra and Kafka)
Lagom services require access to Cassandra (or another journal store) and, depending on the service use case, optionally to Kafka. In Lagom these fall into the category of "external services".
Cassandra and Kafka can be deployed, depending on the preferences:
- in kubernetes cluster
- on dedicated hosts
- as SaaS
I personally use dedicated hosts, for these reasons:
- prior to running the Lagom system on Kubernetes I was running it on Lightbend ConductR, where the recommended Cassandra and Kafka deployment was on dedicated hosts. When migrating the Lagom system to Kubernetes it was not possible to migrate Cassandra and Kafka because the required downtime would have been too long
- when I started with Lagom there were not many SaaS options available
Lagom uses the Kubernetes DNS SRV method for locating external service endpoints.
In Kubernetes, DNS SRV records are generated from Kubernetes service resources. For external services, the Kubernetes service resource abstracts the external service access and thereby its deployment type.
If the external services are deployed in the Kubernetes cluster, Cassandra and/or Kafka Kubernetes service resources will be deployed and DNS SRV records will be generated from them.
If the external services are deployed outside of the Kubernetes cluster (dedicated hosts or SaaS), a Kubernetes headless service can be used to configure access to them. A headless service, like a "regular" Kubernetes service resource, generates DNS SRV records.
The DNS SRV record needs to be configured using the configuration parameter:
lagom.akka.discovery.service-name-mappings.{externalServiceName}.lookup
Example of cassandra headless service resource:
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  namespace: lagom
spec:
  ports:
    - name: "cql"
      protocol: "TCP"
      port: 9042
      targetPort: 9042
      nodePort: 0
---
apiVersion: v1
kind: Endpoints
metadata:
  name: cassandra
  namespace: lagom
subsets:
  - addresses:
      - ip: 10.0.1.85
      - ip: 10.0.2.57
      - ip: 10.0.3.106
    ports:
      - name: "cql"
        port: 9042
DNS SRV record example:
_cql._tcp.cassandra.lagom.svc.cluster.local
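The generated SRV record can be verified, for example, from a pod inside the cluster (assuming dig is available in the pod image):

# Query the SRV record that Akka DNS discovery will resolve (run from a pod in the cluster)
dig SRV _cql._tcp.cassandra.lagom.svc.cluster.local
# Or check the endpoints that back the headless service
kubectl -n lagom get endpoints cassandra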
Lagom external service DNS SRV name setup configuration (deployment.yaml ENV JAVA_OPTS parameter):
-Dlagom.akka.discovery.service-name-mappings.cas_native.lookup=_cql._tcp.cassandra.lagom.svc.cluster.local
Example of kafka headless service resource:
apiVersion: v1
kind: Service
metadata:
  name: kafka
  namespace: lagom
spec:
  ports:
    - name: "broker"
      protocol: "TCP"
      port: 9092
      targetPort: 9092
      nodePort: 0
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kafka
  namespace: lagom
subsets:
  - addresses:
      - ip: 10.0.1.85
      - ip: 10.0.2.57
      - ip: 10.0.3.106
    ports:
      - name: "broker"
        port: 9092
DNS SRV record example:
_broker._tcp.kafka.lagom.svc.cluster.local
Lagom external service DNS SRV name setup configuration (deployment.yaml ENV JAVA_OPTS parameter):
-Dlagom.akka.discovery.service-name-mappings.kafka_native.lookup=_broker._tcp.kafka.lagom.svc.cluster.local
Hi! Does this setting -Dlagom.akka.discovery.service-name-mappings.cas_native.lookup=_cql._tcp.cassandra.lagom.svc.cluster.local require some additional configuration in application.conf and/or Module ? It seems without it the default “lookup” skips this setting, i.e. uses cassandra.default.contact-points which is ‘127.0.0.1’ by default. It is described here https://www.lagomframework.com/documentation/1.5.x/java/ProductionOverview.html#Using-static-Cassandra-contact-points
Hi,
This setting is used in the akka-service-locator Lagom module:
https://github.com/lagom/lagom/blob/628805dc6da419d0866ee9d1581838d5cf8f3157/akka-service-locator/core/src/main/scala/com/lightbend/lagom/internal/client/ServiceNameMapper.scala#L46
You need to have the AkkaDiscoveryServiceLocatorModule Play module enabled in your conf:
play.modules.enabled += "com.lightbend.lagom.javadsl.akka.discovery.AkkaDiscoveryServiceLocatorModule"
Check comment from Renato for 1.5.1: https://discuss.lightbend.com/t/lagom-1-4-12-and-1-5-1-releases/4105
Hope this helps.
BR,
Alan
Ok. I only found that it is said that AkkaDiscoveryServiceLocatorModule is added by default to your project and will be bound only in production mode 🙂 And the configuration play.modules.enabled += com.lightbend.lagom.javadsl.akka.discovery.AkkaDiscoveryServiceLocatorModule is already done in reference.conf. So if I understand it correctly nothing else should be done. But in my case the lookup of Cassandra/Kafka via lagom.akka.discovery.service-name-mappings..lookup worked only for Kafka, and Cassandra fell back to cassandra.default.contact-points. Of course it is possible that I've made a mistake somewhere else 🙁
Anyway, thanks a lot! Nice article
I do not think it should fall back to the default contact point configuration if the ServiceLocator is configured correctly. I would assume AkkaDiscoveryServiceLocator is not being used.
Can you check, at runtime, which implementation of the ServiceLocator interface is used? (You can inject the interface and print out the implementation.)
My resulting application.conf contains
akka.discovery.kubernetes-api.pod-domain = "k8s.test"
play.modules.enabled += com.lightbend.lagom.javadsl.akka.discovery.AkkaDiscoveryServiceLocatorModule
cassandra.default.contact-points = ["127.0.0.1"] # It is default
lagom.akka.discovery.service-name-mappings.cas_native.lookup=_cql._tcp.reactive-sandbox-test-reactive-sandbox-cassandra.test # test is namespace of the k8s.
Pinging the SRV name is successful
# ping _cql._tcp.reactive-sandbox-test-reactive-sandbox-cassandra.test
PING _cql._tcp.reactive-sandbox-test-reactive-sandbox-cassandra.test (10.233.108.161): 56 data bytes
64 bytes from 10.233.108.161: seq=0 ttl=64 time=0.033 ms
64 bytes from 10.233.108.161: seq=1 ttl=64 time=0.096 ms
64 bytes from 10.233.108.161: seq=2 ttl=64 time=0.053 ms
^C
Log output
2019-06-07T17:02:04.843+0300 [error] akka.actor.OneForOneStrategy [sourceThread=delivery-service-akka.actor.default-dispatcher-15, akkaTimestamp=14:02:04.842UTC, akkaSource=akka://delivery-service/user/cassandraOffsetStorePrepare-singleton/singleton/cassandraOffsetStorePrepare, sourceActorSystem=delivery-service] – All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.TransportException: [/127.0.0.1:9042] Cannot connect))
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.TransportException: [/127.0.0.1:9042] Cannot connect))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:268)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:107)
at com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1652)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1571)
at com.datastax.driver.core.Cluster.init(Cluster.java:208)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:376)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:355)
at akka.persistence.cassandra.ConfigSessionProvider.$anonfun$connect$1(ConfigSessionProvider.scala:48)
at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:307)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
If you check here: https://github.com/lagom/lagom/blob/628805dc6da419d0866ee9d1581838d5cf8f3157/persistence-cassandra/core/src/main/resources/reference.conf#L25
the default Cassandra session provider is com.lightbend.lagom.internal.persistence.cassandra.ServiceLocatorSessionProvider.
Only if you explicitly set it to akka.persistence.cassandra.ConfigSessionProvider will the static config be used.
So there is no contact point fallback; it is one or the other.
Did you maybe explicitly set akka.persistence.cassandra.ConfigSessionProvider? If so, remove it.
No.
Awesome post! Keep up the great work! 🙂
Great content! Super high-quality! Keep it up! 🙂
Great content! You made my day! Thank you.