Cloud Pub/Sub

Have you ever had to process a large number of requests, only to find that your machine could not keep up with the load?

It is a bad position to be in, because eventually the service stalls or requests start timing out. This is exactly the kind of situation that calls for a message queue.

To use a message queue, we simply add each incoming user request to the queue. Our service can then take requests off the queue one at a time and handle them at its own pace.
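To make the idea concrete, here is a minimal sketch that uses only Python's standard library. The names handle, worker, and process_all are made up for illustration, and an in-process queue stands in for a real messaging service.

```python
import queue
import threading


def handle(request: str) -> str:
    """Stand-in for real work; upper-casing is purely illustrative."""
    return request.upper()


def worker(requests: queue.Queue, results: list) -> None:
    # Take requests off the queue one at a time until the sentinel arrives.
    while True:
        request = requests.get()
        if request is None:
            break
        results.append(handle(request))


def process_all(incoming: list) -> list:
    requests = queue.Queue()
    results = []
    consumer = threading.Thread(target=worker, args=(requests, results))
    consumer.start()
    for request in incoming:
        # Producer side: enqueue and move on without waiting for the work.
        requests.put(request)
    requests.put(None)  # sentinel: no more requests
    consumer.join()
    return results
```

The producer never waits for a request to be processed. It only waits for the queue to accept the request, which is what lets the service absorb bursts of traffic.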


What is Cloud Pub/Sub?



As the name suggests, Cloud Pub/Sub follows the publisher-subscriber model. It is a fully managed, real-time messaging service for event-driven systems, and its goal is to provide reliable asynchronous messaging between independent applications.


Core Concepts

  • Topic: A named resource to which publishers send messages.

  • Subscription: A named resource that represents the stream of messages from a single topic to be delivered to the subscribing application. To receive the messages published to a topic, we must create a subscription to it.

  • Message: The data that a publisher sends to a topic and that is eventually delivered to subscribers.

  • Message attribute: A key-value pair that the publisher can attach to a message. For example, an attribute could mark the language of a message so that the receiver knows how to read it.

 

How does Pub/Sub work?

A publisher application creates messages and sends them to a topic, which is a named resource. To receive these messages, a subscriber application creates a subscription to the topic. Pub/Sub keeps each message in the subscription until a subscriber acknowledges it. The subscriber receives messages either by having Cloud Pub/Sub push them to an endpoint of its choice or by pulling them from the service. Once the subscriber acknowledges a message, it is removed from the subscription backlog and is not delivered again. A message contains the payload and optional attributes that describe the payload.
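The lifecycle above can be sketched with a toy in-memory model. This is not the Cloud Pub/Sub client library. The classes Topic, Subscription, and Message are hypothetical and only mirror the concepts, and the toy redelivers on every pull, whereas the real service waits for an acknowledgement deadline to pass first.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Message:
    message_id: str
    data: bytes
    attributes: dict = field(default_factory=dict)


class Subscription:
    """Toy backlog: a message stays here until it is acknowledged."""

    def __init__(self, name: str):
        self.name = name
        self.backlog = {}  # message_id -> Message, in publish order

    def pull(self, max_messages: int = 10) -> list:
        # Unacknowledged messages are handed out again on every pull.
        return list(self.backlog.values())[:max_messages]

    def acknowledge(self, message_id: str) -> None:
        self.backlog.pop(message_id, None)


class Topic:
    def __init__(self, name: str):
        self.name = name
        self.subscriptions = []
        self._ids = count(1)

    def subscribe(self, name: str) -> Subscription:
        subscription = Subscription(name)
        self.subscriptions.append(subscription)
        return subscription

    def publish(self, data: bytes, **attributes) -> str:
        message = Message(str(next(self._ids)), data, attributes)
        # Every subscription gets its own copy in its own backlog.
        for subscription in self.subscriptions:
            subscription.backlog[message.message_id] = message
        return message.message_id


topic = Topic("orders")
billing = topic.subscribe("billing")
message_id = topic.publish(b"order created", source="web")
first_delivery = billing.pull()
second_delivery = billing.pull()  # not acknowledged yet, so it comes back
billing.acknowledge(message_id)   # now it leaves the backlog
```

Note that a subscription created after a message was published does not see that message in this sketch, because the backlog belongs to the subscription and not to the topic.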

Publisher-Subscriber Relationships

 

Figure: one-to-many, many-to-one, and many-to-many publisher-subscriber relationships.


Communication can be one-to-many (fan-out), many-to-one (fan-in), or many-to-many. For example, if subscriber x wants to receive messages from two different publishers, a and b, it creates two subscriptions, one on each publisher's topic. This is the many-to-one model. Similarly, if two different applications want to receive the same messages from one publisher, each creates its own subscription to the same topic. This is the one-to-many model.
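The wiring for both cases can be sketched with a small in-memory broker. The Broker class and the topic and subscription names below are invented for this example and are not part of any Google API.

```python
from collections import defaultdict


class Broker:
    """Toy broker that copies each message to every subscription on a topic."""

    def __init__(self):
        self.subscriptions = defaultdict(list)  # topic -> subscription names
        self.inboxes = defaultdict(list)        # subscription name -> messages

    def subscribe(self, topic: str, subscription: str) -> None:
        self.subscriptions[topic].append(subscription)

    def publish(self, topic: str, message: str) -> None:
        for subscription in self.subscriptions[topic]:
            self.inboxes[subscription].append(message)


broker = Broker()

# Many-to-one (fan-in): subscriber x holds one subscription per topic.
broker.subscribe("topic-a", "x-on-a")
broker.subscribe("topic-b", "x-on-b")

# One-to-many (fan-out): subscriber y adds a second subscription on topic-a.
broker.subscribe("topic-a", "y-on-a")

broker.publish("topic-a", "from a")
broker.publish("topic-b", "from b")

received_by_x = broker.inboxes["x-on-a"] + broker.inboxes["x-on-b"]
received_by_y = broker.inboxes["y-on-a"]
```

Taken together, the three subscriptions form a many-to-many arrangement: two publishers, two subscribers, and neither side knows about the other.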

Google Cloud Pub/Sub

Google Cloud Pub/Sub is a service that offers scalable, reliable, and secure messaging for connecting cloud-based applications. It enables many-to-many asynchronous messaging both inside and outside Google Cloud Platform. Its interface is relatively simple and can be driven with HTTP and JSON through its API, which makes the system flexible enough to fit almost any situation. As with other Google APIs, we can use the client libraries that Google provides to make access to Cloud Pub/Sub easier.
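To give a feel for the HTTP and JSON interface, the sketch below builds the URL and JSON body for a publish call without sending anything. The project and topic names are placeholders, and a real request would also need an OAuth 2.0 access token in its Authorization header, which is left out here.

```python
import base64
import json


def build_publish_request(project: str, topic: str, payload: bytes,
                          attributes: dict = None) -> tuple:
    """Return (url, json_body) for a Pub/Sub REST publish call."""
    url = (
        "https://pubsub.googleapis.com/v1/"
        f"projects/{project}/topics/{topic}:publish"
    )
    # The REST API expects the payload as a base64-encoded string.
    message = {"data": base64.b64encode(payload).decode("ascii")}
    if attributes:
        message["attributes"] = attributes
    return url, json.dumps({"messages": [message]})


# "my-project" and "orders" are hypothetical names.
url, body = build_publish_request(
    "my-project", "orders", b"order created", {"origin": "web"}
)
```

In practice the client libraries hide this encoding step, so building the request by hand is mostly useful for understanding what goes over the wire.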

Features

  • Global message routing, which simplifies multi-region systems.

  • Separate quotas and billing can be set up for publishers and subscribers.

  • At-least-once delivery at any scale, achieved through synchronous, cross-zone message replication and per-message receipt tracking. Because each message is delivered at least once, a message may occasionally be delivered more than once.

  • No partitions or shards to manage. We just set our quota, publish, and start consuming.
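Because at-least-once delivery means the same message can arrive more than once, subscribers usually need to be idempotent. Below is a minimal sketch of one way to do that by remembering message IDs. The IdempotentHandler class is hypothetical, and it keeps the IDs in memory, so a real system would need durable storage to survive restarts.

```python
class IdempotentHandler:
    """Wraps a processing function so duplicate deliveries are ignored."""

    def __init__(self, process):
        self._process = process
        self._seen = set()  # IDs of messages that were already processed

    def __call__(self, message_id: str, data: bytes) -> bool:
        if message_id in self._seen:
            return False  # duplicate delivery: skip the work
        self._process(data)
        # Record the ID only after processing succeeded, so that a failed
        # attempt is retried on the next delivery.
        self._seen.add(message_id)
        return True


processed = []
handler = IdempotentHandler(processed.append)
first = handler("msg-1", b"charge card")
duplicate = handler("msg-1", b"charge card")  # same message delivered again
```

In both cases the subscriber would still acknowledge the message. The only difference is whether the work is actually done.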

Types of Pub/Sub services

In Pub/Sub, there are two services:

  • Pub/Sub service: This messaging service is the default choice for most users and applications. It offers the highest reliability and the broadest set of integrations, along with automatic capacity management. Pub/Sub guarantees synchronous replication of all data to at least two zones and best-effort replication to a third zone.

  • Pub/Sub Lite service: A separate but similar messaging service built for lower cost. It offers lower reliability than Pub/Sub and provides either zonal or regional topic storage. Zonal Lite topics are stored in only one zone, while regional Lite topics replicate data to a second zone asynchronously. With Pub/Sub Lite we must also pre-provision and manage storage and throughput capacity ourselves. Consider Pub/Sub Lite only when achieving a low cost justifies the extra operational work and the lower reliability.


Common Use Cases


  • Real-time event distribution: Events, raw or processed, can be distributed in real time to various applications across our team and company. Pub/Sub supports the "enterprise event bus" and event-driven application design patterns. It also integrates with the many Google systems that export events to Pub/Sub.

  • Ingesting user interaction and server events: We can send user interaction events from end-user apps, or server events from our system, to Pub/Sub. From there, a stream processing tool such as Dataflow can deliver the events to data stores such as BigQuery, Cloud Bigtable, and Cloud Storage. Pub/Sub lets us collect events from many clients simultaneously.

  • Parallel processing and workflows: By using Pub/Sub messages to connect to Cloud Functions, we can efficiently distribute a large number of tasks among multiple workers. Examples of such tasks are compressing text files, sending email notifications, evaluating AI models, and reformatting images.

  • Enterprise event bus: We can build a real-time enterprise data sharing bus to distribute business events, database updates, and analytics events throughout our organization.

  • Data replication among databases: Pub/Sub is frequently used to distribute change events from databases. These events can then be used to build a view of the database state and state history in BigQuery and other data storage systems.

  • Load balancing for reliability: For example, instances of a service can be deployed on Compute Engine in different zones but subscribe to the same topic. If the service fails in one zone, the instances in the other zones pick up the load automatically.

  • Data streaming from applications, services, or IoT devices: For instance, a SaaS application can publish a real-time feed of events. Alternatively, a home sensor can stream data to Pub/Sub, from where a Dataflow pipeline makes it available to other Google Cloud products.

  • Refreshing distributed caches: Distributed caches can be refreshed, for instance, by publishing invalidation events that carry the IDs of the objects that have changed.
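As one illustration from the list above, here is a rough sketch of the cache refresh use case. The CacheNode class and publish_invalidation function are invented for this example, a plain dictionary stands in for the database, and a simple loop stands in for the topic that would deliver the event to every cache node.

```python
class CacheNode:
    """A cache in front of a slower backing store."""

    def __init__(self, backing_store: dict):
        self._store = backing_store
        self._cache = {}

    def get(self, object_id: str):
        # Load from the backing store only on a cache miss.
        if object_id not in self._cache:
            self._cache[object_id] = self._store[object_id]
        return self._cache[object_id]

    def on_invalidation(self, object_id: str) -> None:
        # Subscriber callback: drop the stale entry so the next read reloads.
        self._cache.pop(object_id, None)


def publish_invalidation(nodes: list, object_id: str) -> None:
    # Stand-in for publishing to a topic every cache node subscribes to.
    for node in nodes:
        node.on_invalidation(object_id)


store = {"user-1": "Ada"}
nodes = [CacheNode(store), CacheNode(store)]
before = [node.get("user-1") for node in nodes]

store["user-1"] = "Grace"                     # the object changes
stale = nodes[0].get("user-1")                # caches still hold the old value
publish_invalidation(nodes, "user-1")
after = [node.get("user-1") for node in nodes]
```

The event carries only the ID of the changed object, so each cache decides for itself when to reload the value.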

 

 

Advantages of Pub/Sub

 

  • Simplify Communication
    Integration and communication code is some of the hardest code to write. The publish-subscribe model reduces complexity by replacing all point-to-point connections with a single connection to a message topic, which manages subscriptions and decides which messages are delivered to which endpoints. Fewer callbacks mean looser coupling and code that is easier to maintain and extend.

  • Eliminate Polling
    Developers who build applications that rely on real-time events benefit greatly from pub/sub messaging. Message topics allow immediate, push-based delivery, so recipients do not have to repeatedly check, or "poll", for new data and changes. This promotes faster response times and reduces delivery latency, which matters most in systems where delays cannot be tolerated.

  • Decouple and Scale Independently
    Pub/Sub makes software more flexible. Because publishers and subscribers are decoupled and work independently of each other, we can develop and scale them separately. We can decide to handle requests one way this month and a different way next month. Because Pub/Sub lets us be flexible in how everything works together, adding or changing functionality does not send ripple effects across the whole system.

  • Dynamic Targeting
    Pub/Sub makes service discovery easier, more natural, and less error-prone. Instead of maintaining a list of peers to which an application can send messages, a publisher simply posts messages to a topic. Any interested party can then subscribe its endpoint to the topic and start receiving these messages. Subscribers can change, upgrade, multiply, or disappear, and the system adjusts dynamically.

 

Disadvantages of the Pub/Sub pattern


Although Pub/Sub is a powerful messaging pattern, it is not the ideal choice for every situation. Let's quickly review some of its drawbacks.


  • Pub/Sub has to be implemented and maintained correctly. For smaller systems, where scalability and decoupling are not essential, adopting Pub/Sub wastes resources and adds unnecessary complexity.

  • Pub/Sub is a poor fit for media such as audio or video, because these require smooth, synchronized streaming between the host and the receiver, and Pub/Sub messaging does not provide synchronous end-to-end communication.

  • Testing can be challenging. Because interactions are asynchronous, a test is not as simple as issuing a request and then examining the result. Instead, the test must inject a message into the system and then observe the process under test to determine when and how the message is handled. If the process under test has to consume a large number of messages from a topic over time, the test setup becomes even more complex.

  • A publisher has no built-in way to know whether a message was delivered successfully, because the broker in the middle may not report the delivery status back to it. Guaranteeing this requires tighter coupling between publisher and subscriber.

  • Malicious publishers can infiltrate the system, which can lead to unwanted messages being published and to subscribers receiving messages they should not normally get.

  • Message formats and message exchange need a well-defined policy. Otherwise, message consumption becomes error-prone and messages can be misinterpreted.
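The testing difficulty mentioned in the list above can be made concrete with a small sketch. The consumer below is a made-up stand-in for a subscriber. The test cannot simply call it and check a return value, so it injects a message and then waits, with a timeout, for a signal that the message was handled.

```python
import queue
import threading


def start_consumer(inbox: queue.Queue, handled: list,
                   done: threading.Event) -> threading.Thread:
    """Hypothetical subscriber that handles one message in the background."""

    def run():
        message = inbox.get()            # blocks until a message arrives
        handled.append(message.upper())  # stand-in for real processing
        done.set()                       # signal that handling finished

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def test_message_is_eventually_handled():
    inbox = queue.Queue()
    handled = []
    done = threading.Event()
    start_consumer(inbox, handled, done)

    inbox.put("resize image 42")  # inject the message into the system

    # There is no return value to inspect, so wait for the side effect.
    assert done.wait(timeout=2.0), "message was not handled in time"
    assert handled == ["RESIZE IMAGE 42"]
```

The timeout is a judgment call. If it is too short the test becomes flaky, and if it is too long a failing test takes a long time to report.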


Comparison between two popular messaging services - Pub/Sub and Apache Kafka


As we have seen, Pub/Sub is an asynchronous messaging service that distributes messages between different parts of an application. To compare it with Kafka, we first need to understand Kafka. Apache Kafka is a distributed, open-source platform that enables applications to publish, subscribe to, store, and process streams of events. Broadly speaking, Pub/Sub behaves like a message queue, while Kafka is an event streaming platform.


Each M in the diagram above stands for a message. Kafka brokers manage multiple ordered partitions of messages, shown as the horizontal rows of messages. Consumers read messages from a particular partition, whose capacity depends on the machine that hosts it. Pub/Sub has no partitions. Instead, consumers read from a topic that scales automatically according to demand. With Kafka, we configure each topic with the number of partitions needed to handle the expected consumer load.
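To see why the number of partitions matters in Kafka, it helps to look at how a keyed message ends up in a partition. The sketch below is a simplification. Kafka's default partitioner hashes the key with murmur2, while this example uses CRC32 from the standard library as a stand-in, and the function name choose_partition is made up.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition (CRC32 stands in for Kafka's hash)."""
    return zlib.crc32(key) % num_partitions


# The same key always lands in the same partition, which preserves
# per-key ordering but caps per-key throughput at one partition's capacity.
partition_now = choose_partition(b"user-1", 6)
partition_later = choose_partition(b"user-1", 6)
```

With Pub/Sub there is no equivalent step for us to configure, because the service does not expose partitions at all.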


The two services also differ in how they are operated:

  • Availability: Kafka must be deployed manually to additional locations, whereas Pub/Sub is deployed in all Google Cloud regions for high availability and low latency.

  • Capacity planning: With Kafka, we plan storage and compute needs in advance ourselves. With Pub/Sub, capacity is planned and managed by Google.

  • Disaster recovery: With Kafka, we design and maintain our own backup and replication. With Pub/Sub, disaster recovery is managed by Google.

  • Infrastructure: With Kafka, we deploy and operate the virtual machines (VMs) or physical machines ourselves, and we must keep versions and patches consistent. Google manages the Pub/Sub infrastructure.

  • Support: Pub/Sub comes with 24-hour on-call staff and support, while self-managed Kafka comes with none.

  • Message storage: Pub/Sub can store messages for up to 7 days. In Kafka, message storage is limited only by the available machine storage.

  • Logging and monitoring: These are self-managed in Kafka and automated with Cloud Logging and Cloud Monitoring in Pub/Sub.


Conclusion

The Pub/Sub pattern is a foundation of asynchronous architectural design, and many messaging products and cloud services are built around it. Working with Pub/Sub can be challenging, especially when many topics are in play. The basic concepts are easy to understand, but implementing them can be difficult, depending on the messaging system in use and the volume of asynchronous activity that has to be supported. Even so, the Pub/Sub pattern is very common for handling asynchronous communication between the components and services of a large distributed application.


Image Sources:

1) https://www.youtube.com/watch?time_continue=79&v=MjEam95VLiI&feature=emb_title

2) https://cloud.google.com/pubsub/docs/overview

3) https://cloud.google.com/architecture/migrating-from-kafka-to-pubsub

