BigBen - a generic, multi-tenant, time-based event scheduler and cron scheduling framework

User submission 661 2022-10-12

BigBen

BigBen is a generic, multi-tenant, time-based event scheduler and cron scheduling framework based on Cassandra and Hazelcast.

It has the following features:

- Distributed - BigBen uses a distributed design and can be deployed on tens or hundreds of machines, either dc-local or cross-dc
- Horizontally scalable - BigBen scales linearly with the number of machines
- Fault tolerant - BigBen employs a number of failure protection modes and can withstand arbitrarily prolonged downtimes
- Performant - BigBen can easily scale to tens of thousands or even millions of event triggers with a very small cluster of machines. It can also easily manage millions of crons running in a distributed manner
- Highly available - As long as a single machine is available in the cluster, BigBen will guarantee the execution of events (albeit with a lower throughput)
- Extremely consistent - BigBen employs a single master design (the master itself is highly available, with n-1 masters on standby in an n-machine cluster) to ensure that no two nodes fire the same event or execute the same cron
- NoSql based - BigBen comes with a default Cassandra implementation but can be easily extended to support other NoSql or even RDBMS data stores
- Auditable - BigBen keeps track of all the events fired and crons executed, with a configurable retention
- Portable, cloud friendly - BigBen ships as an application bundled as a war or as an embeddable library jar, and can be deployed on any cloud, on-prem or public

Use cases

BigBen can be used for a variety of time-based workloads, both single-trigger events and repeating crons. Some use cases are:

- Delayed execution - E.g. if a job is to be executed 30 mins from now
- System retries - E.g. if service A wants to call service B and service B is down at the moment, then service A can schedule an exponential backoff retry strategy with retry intervals of 1 min, 10 mins, 1 hour, 12 hours, and so on
- Timeout tickers - E.g. if service A sends a message to service B via Kafka and expects a response in 1 min, then it can schedule a timeout check event to be executed after 1 min
- Polling services - E.g. if service A wants to poll service B at some frequency, it can schedule a cron to be executed at that frequency
- Notification engine - BigBen can be used to implement a notification engine with scheduled deliveries, scheduled polls, etc.
- Workflow state machine - BigBen can be used to implement a distributed workflow with state suspensions, alerts and monitoring of those suspensions

Architectural Goals

BigBen was designed to achieve the following goals:

- Uniformly distributed storage model
  - Resilient to hot spotting due to sudden surge in traffic
- Uniform execution load profile in the cluster
  - Ensure that all nodes have similar load profiles to minimize misfires
- Linear horizontal scaling
- Lock-free execution
  - Avoid resource contentions
- Plugin based architecture to support a variety of databases like Cassandra, Couchbase, Solr Cloud, Redis, RDBMS, etc.
- Low maintenance, elastic scaling

Design and architecture

See the blog published at Medium for a full description of the various design elements of BigBen.

Events Inflow

BigBen can receive events in two modes:

- kafka - inbound and outbound Kafka topics to consume event requests and publish event triggers
- http - HTTP APIs to send event requests and HTTP APIs to receive event triggers

It is strongly recommended to use kafka for better scalability.

Event Inflow diagram

Request and Response channels can be mixed. For example, the event requests can be sent through HTTP APIs but the event triggers (response) can be received through a Kafka Topic.

Event processing guarantees

BigBen has robust event-processing guarantees that let it survive various failures. However, event processing is not the same as event acknowledgement. BigBen works in a no-acknowledgement mode (at least for now). Once an event is triggered, it is either published to Kafka or sent through an HTTP API. Once the Kafka producer returns success, or the HTTP API returns a non-500 status code, the event is assumed to be processed and is marked as such in the system. However, if for whatever reason the event was not processed and resulted in an error (e.g. the Kafka producer timing out, or the HTTP API returning 503), then the event will be retried multiple times as per the strategies discussed below.

Event misfire strategy

Several scenarios can prevent BigBen from triggering an event on time. Such scenarios are called misfires. Some of them are:

- BigBen's internal components are down during event trigger. E.g.
  - BigBen's data store is down and events could not be fetched
  - VMs are down
- Kafka Producer could not publish due to loss of partitions / brokers or any other reasons
- HTTP API returned a 500 error code
- Any other unexpected failure

In any of these cases, the event is first retried in memory using an exponential back-off strategy.

The following parameters control the retry behavior:

- event.processor.max.retries - how many in-memory retries will be made before declaring the event as error, default is 3
- event.processor.initial.delay - how long in seconds the system should wait before kicking in the retry, default is 1 second
- event.processor.backoff.multiplier - the back-off multiplier factor, default is 2. E.g. the intervals would be 1 second, 2 seconds, 4 seconds.
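The interplay of the three parameters can be illustrated with a minimal Python sketch. The function name `retry_delays` is hypothetical and is not part of BigBen; it only reproduces the arithmetic described above, using the documented defaults.

```python
def retry_delays(max_retries=3, initial_delay=1.0, multiplier=2.0):
    """Return the wait in seconds before each in-memory retry.

    Defaults mirror event.processor.max.retries (3),
    event.processor.initial.delay (1 second) and
    event.processor.backoff.multiplier (2).
    """
    return [initial_delay * multiplier ** attempt for attempt in range(max_retries)]


print(retry_delays())  # [1.0, 2.0, 4.0]
```

With the defaults, the three retries are spread over 7 seconds in total before the event is declared an error.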

If the event is still not processed, it is marked as ERROR. All events marked ERROR are retried up to a configured limit called events.backlog.check.limit. This value can be an arbitrary amount of time, e.g. 1 day, 1 week, or even 1 year. For example, if the limit is set to 1 week, then failed events will be retried for 1 week, after which they will be permanently marked as ERROR and ignored. The events.backlog.check.limit can be changed at any time by changing the value in the bigben.yaml file and bouncing the servers.
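As a rough illustration of the backlog window, the check below decides whether an errored event is still eligible for retry. It is a sketch under one explicit assumption: the window is measured from the event's scheduled time. The text above does not say where BigBen starts counting, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def should_retry_errored_event(event_time, now, backlog_limit):
    """True while an ERROR event is still inside the backlog-check window.

    Assumption: the window starts at the event's scheduled time.
    """
    return now - event_time <= backlog_limit


now = datetime(2022, 10, 12, tzinfo=timezone.utc)
one_week = timedelta(weeks=1)
print(should_retry_errored_event(now - timedelta(days=6), now, one_week))
print(should_retry_errored_event(now - timedelta(days=8), now, one_week))
```

Under this assumption an event that failed six days ago is still retried, while one that failed eight days ago is left permanently in ERROR.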

Event bucketing and shard size

BigBen shards events by minute. However, since it's not known in advance how many events will be scheduled in a given minute, the buckets are further sharded by a predefined shard size. The shard size is a design choice that needs to be made before deployment. Currently, it's not possible to change the shard size once defined.

An undersized shard value has minimal performance impact; an oversized shard value, however, may keep some machines idling. The default value of 1000 is good enough for most practical purposes as long as the number of events to be scheduled per minute exceeds 1000 x n, where n is the number of machines in the cluster. If the events to be scheduled are much fewer than 1000, then a smaller shard size may be chosen.
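The bucketing arithmetic can be sketched as follows. The helper names are hypothetical, and assigning events to shards by their arrival index within the minute is an assumption made for illustration, not a description of BigBen's internals.

```python
import math
from datetime import datetime, timezone


def bucket_of(event_time):
    """Truncate an event time to its one-minute bucket."""
    return event_time.replace(second=0, microsecond=0)


def shard_count(events_in_bucket, shard_size=1000):
    """Number of shards needed to hold the events of one bucket."""
    return math.ceil(events_in_bucket / shard_size)


def shard_of(event_index, shard_size=1000):
    """Shard for the event_index-th event (0-based) in a bucket.

    Assumption: shards are filled in arrival order.
    """
    return event_index // shard_size


print(bucket_of(datetime(2018, 1, 31, 4, 0, 42, tzinfo=timezone.utc)))
print(shard_count(2500))  # 3
```

With 2,500 events in a minute and the default shard size, the bucket splits into three shards, so a three-machine cluster could take one shard each. With only 500 events there is a single shard, which is the idling scenario the text warns about.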

Multi shard parallel processing

Multi-tenancy

Multiple tenants can use BigBen in parallel. Each one can configure how events are delivered once triggered. Tenant 1 can configure events to be delivered to Kafka topic t1, whereas tenant 2 can have them delivered via a specific HTTP URL. The usage of tenants will become clearer with the explanation of BigBen APIs below.

How to deploy BigBen?

The following steps set up BigBen:

1. git clone the master branch
2. Set up a Cassandra cluster
3. Create a keyspace bigben in Cassandra
4. Open the file bigben-schema.cql and execute the commands on the cql prompt
5. Open the file bigben.yaml
   - Modify the cassandra related properties
   - Modify the hazelcast related properties
     - Add the comma separated IPs of nodes participating in the cluster
   - Modify any other property if needed
6. Uncomment the processor.class property if you plan to use kafka for receiving events
   - Go to the kafka section and add the appropriate topic name
   - Add / modify any other properties as needed
7. Check the buckets section and set the backlog.check.limit to some value (e.g. set 1440 for a one day backlog check limit)
8. If you don't want to use the cron module, remove it from under the modules section
9. Run mvn clean install -DskipTests
10. Copy the app/target/bigben.war to the server location in all the nodes
11. Start the server
12. Wait for some time, then call the GET /events/cluster API and see that a master is chosen

APIs

cluster

GET /events/cluster

Sample response (a 3-node cluster running on a single machine on three different ports: 5701, 5702, 5703):

{
  "[127.0.0.1]:5702": "Master",
  "[127.0.0.1]:5701": "Slave",
  "[127.0.0.1]:5703": "Slave"
}

The node marked Master is the master node that does the scheduling.
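A deployment check could parse this response to confirm that exactly one master was elected. The sketch below works on the response body as a string; the function name `find_master` is hypothetical, and fetching the body over HTTP is left out.

```python
import json


def find_master(cluster_response):
    """Return the address of the single node reported as Master."""
    members = json.loads(cluster_response)
    masters = [address for address, role in members.items() if role == "Master"]
    if len(masters) != 1:
        raise ValueError(f"expected exactly one master, found {len(masters)}")
    return masters[0]


sample = """
{
  "[127.0.0.1]:5702": "Master",
  "[127.0.0.1]:5701": "Slave",
  "[127.0.0.1]:5703": "Slave"
}
"""
print(find_master(sample))  # [127.0.0.1]:5702
```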

tenant registration

A tenant can be registered by calling the following API

POST /events/tenant/register

payload schema

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "tenant": { "type": "string" },
    "type": { "type": "string" },
    "props": { "type": "object" }
  },
  "required": [ "tenant", "type", "props" ]
}

- tenant - specifies a tenant and can be any arbitrary value.
- type - specifies the type of tenant. One of three types can be used:
  - MESSAGING - specifies that the tenant wants events delivered via a messaging queue. Currently, kafka is the only supported messaging system.
  - HTTP - specifies that the tenant wants events delivered via an http callback URL.
  - CUSTOM_CLASS - specifies a custom event processor implemented for custom processing of events.
- props - a bag of properties needed for each type of tenant.

kafka sample:

{
  "tenant": "TenantA/ProgramB/EnvC",
  "type": "MESSAGING",
  "props": {
    "topic": "some topic name",
    "bootstrap.servers": "node1:9092,node2:9092"
  }
}

http sample:

{
  "tenant": "TenantB/ProgramB/EnvC",
  "type": "HTTP",
  "props": {
    "url": "http://someurl",
    "headers": {
      "header1": "value1",
      "header2": "value2"
    }
  }
}
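A client might assemble the registration body with a small helper like the one below. This is a sketch: `tenant_registration` is a hypothetical name, and the validation only mirrors the schema and the three tenant types listed above. Sending the body to POST /events/tenant/register is left to whatever HTTP client is in use.

```python
import json

TENANT_TYPES = {"MESSAGING", "HTTP", "CUSTOM_CLASS"}


def tenant_registration(tenant, tenant_type, props):
    """Build the JSON body for POST /events/tenant/register."""
    if tenant_type not in TENANT_TYPES:
        raise ValueError(f"unknown tenant type: {tenant_type}")
    if not isinstance(props, dict):
        raise TypeError("props must be a JSON object")
    return json.dumps({"tenant": tenant, "type": tenant_type, "props": props})


body = tenant_registration(
    "TenantA/ProgramB/EnvC",
    "MESSAGING",
    {"topic": "some topic name", "bootstrap.servers": "node1:9092,node2:9092"},
)
print(body)
```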

fetch all tenants:

GET /events/tenants

event scheduling

POST /events/schedule

Payload - List<EventRequest>

EventRequest schema:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "id": { "type": "string" },
    "eventTime": {
      "type": "string",
      "description": "An ISO-8601 formatted timestamp e.g. 2018-01-31T04:00:00Z"
    },
    "tenant": { "type": "string" },
    "payload": {
      "type": "string",
      "description": "an optional event payload"
    },
    "mode": {
      "type": "string",
      "enum": ["UPSERT", "REMOVE"]
    }
  },
  "required": [ "id", "eventTime", "tenant" ]
}
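The schema can be exercised with a short builder that produces the list body for POST /events/schedule. This is a sketch: the helper names are hypothetical, and formatting eventTime in UTC with a trailing Z is an assumption based on the example timestamp in the schema.

```python
import json
from datetime import datetime, timezone


def event_request(event_id, event_time, tenant, payload=None, mode=None):
    """Build one EventRequest dict; payload and mode are optional."""
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    request = {
        "id": event_id,
        # Assumption: timestamps are sent in UTC, e.g. 2018-01-31T04:00:00Z
        "eventTime": event_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "tenant": tenant,
    }
    if payload is not None:
        request["payload"] = payload
    if mode is not None:
        if mode not in ("UPSERT", "REMOVE"):
            raise ValueError(f"unknown mode: {mode}")
        request["mode"] = mode
    return request


def schedule_body(requests):
    """Serialize a list of EventRequest dicts as the request body."""
    return json.dumps(list(requests))


when = datetime(2018, 1, 31, 4, 0, tzinfo=timezone.utc)
print(schedule_body([event_request("event-1", when, "TenantA/ProgramB/EnvC", mode="UPSERT")]))
```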

find an event

GET /events/find?id=?&tenant=?

dry run

POST /events/dryrun?id=?&tenant=?

Fires an event without changing its final status.
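Both the find and dry run endpoints take the same id and tenant query parameters. Because tenant names such as TenantA/ProgramB/EnvC contain slashes, it is safer to URL-encode the values than to splice them into the URL by hand. The sketch below does that with the standard library; the base URL and the function name are hypothetical.

```python
from urllib.parse import urlencode


def event_query_url(base_url, path, event_id, tenant):
    """Build a URL for /events/find or /events/dryrun with encoded parameters."""
    query = urlencode({"id": event_id, "tenant": tenant})
    return f"{base_url.rstrip('/')}{path}?{query}"


# http://localhost:8080 is a placeholder for wherever BigBen is deployed
print(event_query_url("http://localhost:8080", "/events/find", "event-1", "TenantA/ProgramB/EnvC"))
print(event_query_url("http://localhost:8080", "/events/dryrun", "event-1", "TenantA/ProgramB/EnvC"))
```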

cron APIs

coming up...
