firebolt: A Golang framework for streaming event processing and data pipeline applications

Submitted by a site user, 2022-10-11

A golang framework for streaming event processing & data pipeline apps

Introduction

Firebolt has a simple model intended to make it easier to write reliable pipeline applications that process a stream of data.

It can be used to build systems such as:

logging/observability pipelines
streaming ETL
event processing pipelines

Every application's pipeline starts with a single source, the component that receives events from some external system. Sources must implement the node.Source interface.

We provide one built-in source:

kafkaconsumer - Events come from a Kafka topic, and are passed to the root nodes as []byte
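
To make the role of a source concrete, here is a minimal sketch of a component that receives events from an external system and hands them on as []byte. The Source interface, LineSource, drain, and demoSource below are hypothetical names invented for this illustration; they are not Firebolt's node.Source, whose actual method set should be taken from the Sources documentation.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// Source is a hypothetical, simplified interface for this sketch only.
// It is NOT Firebolt's node.Source.
type Source interface {
	// Start emits events on out until the input is exhausted, then closes out.
	Start(out chan<- []byte) error
}

// LineSource treats every line read from R as one event.
type LineSource struct {
	R io.Reader
}

// Start reads R line by line and sends each line downstream as []byte.
func (s *LineSource) Start(out chan<- []byte) error {
	defer close(out)
	sc := bufio.NewScanner(s.R)
	for sc.Scan() {
		// Copy the bytes: Scanner reuses its internal buffer between calls.
		out <- append([]byte(nil), sc.Bytes()...)
	}
	return sc.Err()
}

// drain runs src in a goroutine and gathers everything it emits.
func drain(src Source) ([]string, error) {
	out := make(chan []byte)
	errc := make(chan error, 1)
	go func() { errc <- src.Start(out) }()

	var events []string
	for raw := range out {
		events = append(events, string(raw))
	}
	return events, <-errc
}

// demoSource feeds two lines through a LineSource.
func demoSource() ([]string, error) {
	input := strings.NewReader("first event\nsecond event\n")
	return drain(&LineSource{R: input})
}

func main() {
	events, err := demoSource()
	fmt.Println(events, err)
}
```

The channel plays the part of the hand-off to the root nodes; a Kafka-backed source would fill the same slot with messages from a topic instead of lines from a reader.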

The processing in your application is carried out by its nodes, which form a processing tree. Data (events) flows down this tree: a parent node passes its results down to its child nodes. Nodes may process events synchronously or asynchronously, and must implement the node.SyncNode or node.AsyncNode interface accordingly.
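
The tree model can be sketched in a few lines of Go. Event, SyncNode, NodeFunc, TreeNode, and demoTree are hypothetical, simplified types written for this illustration and are not Firebolt's own definitions; the point is only to show a parent's result being passed to each of its children.

```go
package main

import (
	"fmt"
	"strings"
)

// Event and SyncNode are hypothetical stand-ins, not Firebolt's types.
type Event struct {
	Payload any
}

type SyncNode interface {
	Process(e *Event) (*Event, error)
}

// NodeFunc adapts a plain function to SyncNode.
type NodeFunc func(e *Event) (*Event, error)

func (f NodeFunc) Process(e *Event) (*Event, error) { return f(e) }

// TreeNode is one position in the processing tree.
type TreeNode struct {
	Name     string
	Node     SyncNode
	Children []*TreeNode
}

// Dispatch processes e at this node and passes the result to every child.
func (t *TreeNode) Dispatch(e *Event) error {
	out, err := t.Node.Process(e)
	if err != nil {
		return fmt.Errorf("node %q: %w", t.Name, err)
	}
	if out == nil {
		return nil // nothing to pass down
	}
	for _, child := range t.Children {
		if err := child.Dispatch(out); err != nil {
			return err
		}
	}
	return nil
}

// demoTree builds decode -> {archive, upper -> index} and sends one event through.
func demoTree() ([]string, error) {
	var seen []string
	record := func(label string) NodeFunc {
		return func(e *Event) (*Event, error) {
			seen = append(seen, label+":"+e.Payload.(string))
			return e, nil
		}
	}
	root := &TreeNode{
		Name: "decode",
		Node: NodeFunc(func(e *Event) (*Event, error) {
			raw, ok := e.Payload.([]byte)
			if !ok {
				return nil, fmt.Errorf("expected []byte, got %T", e.Payload)
			}
			return &Event{Payload: string(raw)}, nil
		}),
		Children: []*TreeNode{
			{Name: "archive", Node: record("archive")},
			{
				Name: "upper",
				Node: NodeFunc(func(e *Event) (*Event, error) {
					return &Event{Payload: strings.ToUpper(e.Payload.(string))}, nil
				}),
				Children: []*TreeNode{{Name: "index", Node: record("index")}},
			},
		},
	}
	err := root.Dispatch(&Event{Payload: []byte("disk full")})
	return seen, err
}

func main() {
	seen, err := demoTree()
	fmt.Println(seen, err)
}
```

Each node returns a new Event rather than mutating its input, so sibling branches never observe each other's changes. That is a design choice of this sketch, not a documented Firebolt guarantee.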

We provide two built-in node types:

kafkaproducer - Events are produced onto a Kafka topic by an asynchronous producer.
elasticsearch - Events are bulk indexed into Elasticsearch.
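
Both built-in nodes hand work off and learn the outcome later, which is what separates an asynchronous node from a synchronous one. The sketch below shows that pattern with a small batching node that reports each event's result through a callback once its batch has been flushed. Batcher, ProcessAsync, FlushNow, and demoBatch are hypothetical names for this illustration, the Flush function is a stand-in for a bulk request, and nothing here reflects how Firebolt's elasticsearch or kafkaproducer nodes are actually implemented.

```go
package main

import (
	"errors"
	"fmt"
)

// pendingDoc is one buffered event plus the callback that reports its fate.
type pendingDoc struct {
	doc  string
	done func(error)
}

// Batcher is a hypothetical async-style node. It is not safe for concurrent
// use; a real node would add locking and a time-based flush.
type Batcher struct {
	Size    int
	Flush   func(docs []string) error // stand-in for a bulk request
	pending []pendingDoc
}

// ProcessAsync buffers doc and returns immediately; done is called later,
// once the batch containing doc has been flushed.
func (b *Batcher) ProcessAsync(doc string, done func(error)) {
	b.pending = append(b.pending, pendingDoc{doc: doc, done: done})
	if len(b.pending) >= b.Size {
		b.FlushNow()
	}
}

// FlushNow sends whatever is buffered and reports the result per event.
func (b *Batcher) FlushNow() {
	if len(b.pending) == 0 {
		return
	}
	batch := b.pending
	b.pending = nil

	docs := make([]string, len(batch))
	for i, p := range batch {
		docs[i] = p.doc
	}
	err := b.Flush(docs)
	for _, p := range batch {
		p.done(err)
	}
}

// demoBatch pushes five documents through batches of two; the batch that
// contains an empty document is rejected as a whole.
func demoBatch() (ok, failed, batches int) {
	b := &Batcher{
		Size: 2,
		Flush: func(docs []string) error {
			batches++
			for _, d := range docs {
				if d == "" {
					return errors.New("empty document in batch")
				}
			}
			return nil
		},
	}
	done := func(err error) {
		if err != nil {
			failed++
		} else {
			ok++
		}
	}
	for _, d := range []string{"a", "b", "c", "", "e"} {
		b.ProcessAsync(d, done)
	}
	b.FlushNow() // flush the final partial batch
	return ok, failed, batches
}

func main() {
	ok, failed, batches := demoBatch()
	fmt.Println(ok, failed, batches)
}
```

Failing a whole batch on one bad document keeps the sketch short; a real bulk API typically reports per-item results, and a node built on it would pass those through individually.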

Example: Logging Pipeline

At DigitalOcean, our first use of Firebolt was in our logging pipeline. This pipeline consumes logs from just about every system we run. The diagram below depicts the source and nodes in this application.

This system uses the built-in kafkaconsumer source (in yellow) and kafkaproducer and elasticsearch nodes (in green). The blue nodes are custom to this application.

What does Firebolt do for me?

Firebolt is intended to address a number of concerns that are common to near-realtime data pipeline applications, making it easy to run a clustered application that scales predictably to handle large data volume.

It is not an analytics tool - it does not provide an easy way to support 'wide operations' like record grouping, windowing, or sorting that require shuffling data within the cluster. Firebolt is for 'straight through' processing pipelines that are not sensitive to the order in which events are processed.

Some of the concerns Firebolt addresses include:

kafka sources - Minimal configuration and no code required to consume from a Kafka topic, consumer lag metrics included
kafka sinks - Same for producing to a Kafka topic
loose coupling - Nodes in the pipeline are loosely coupled, making them easily testable and highly reusable
simple stream filtering - Filter the stream by returning nil in your nodes
convenient error handling - Send events that fail processing to a kafka topic for recovery or analysis with a few lines of config
outage recovery: offset management - Configurable Kafka offset management during recovery lets you determine the maximum "catch up" to attempt after an outage, so you can quickly get back to realtime processing.
outage recovery: parallel recovery - After an outage, process realtime data and "fill-in" the outage time window in parallel, with a rate limit on the recovery window.
monitorability - Firebolt exposes Prometheus metrics to track the performance of your Source and all Nodes without writing code. Your nodes can expose their own custom internal metrics as needed.
leader election - Firebolt uses Zookeeper to conduct leader elections, facilitating any processing that may need to be conducted on one-and-only-one instance.
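
Two of these concerns, filtering by returning nil and routing failed events elsewhere, are easy to picture with a small sketch. Everything named below (LogLine, Failed, keepErrors, runPipeline, demoFilter) is hypothetical and written for this illustration. In Firebolt the failed events would go to a Kafka topic set up through configuration; here a plain slice stands in for that topic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LogLine is a hypothetical event shape for this sketch.
type LogLine struct {
	Level string `json:"level"`
	Msg   string `json:"msg"`
}

// Failed keeps the original bytes of an event that could not be processed,
// together with the reason, so it can be recovered or analyzed later.
type Failed struct {
	Raw    []byte
	Reason string
}

// keepErrors parses one raw event and keeps only error-level lines.
// Returning (nil, nil) means "filtered out", not "failed".
func keepErrors(raw []byte) (*LogLine, error) {
	var line LogLine
	if err := json.Unmarshal(raw, &line); err != nil {
		return nil, fmt.Errorf("parse log line: %w", err)
	}
	if line.Level != "error" {
		return nil, nil
	}
	return &line, nil
}

// runPipeline applies keepErrors to every event and separates the outcomes.
func runPipeline(input [][]byte) (passed []LogLine, failed []Failed) {
	for _, raw := range input {
		out, err := keepErrors(raw)
		switch {
		case err != nil:
			// Stand-in for producing the event onto an error topic.
			failed = append(failed, Failed{Raw: raw, Reason: err.Error()})
		case out == nil:
			// Filtered: the event silently leaves the stream.
		default:
			passed = append(passed, *out)
		}
	}
	return passed, failed
}

// demoFilter runs one matching, one filtered, and one malformed event.
func demoFilter() ([]LogLine, []Failed) {
	return runPipeline([][]byte{
		[]byte(`{"level":"error","msg":"disk full"}`),
		[]byte(`{"level":"info","msg":"heartbeat"}`),
		[]byte(`not json`),
	})
}

func main() {
	passed, failed := demoFilter()
	fmt.Println(len(passed), len(failed))
}
```

Keeping "filtered" (nil result, nil error) distinct from "failed" (non-nil error) is what lets the failed events be captured for recovery without the filtered ones polluting that stream.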

Documentation

Configuration - The configuration file format
Execution - How Firebolt processes your data
Registry - Adding node types to the registry
Sample Application Code - Example code for running the Firebolt executor
Sources - Implementing and using sources
Sync Nodes - Implementing and using synchronous nodes
Async Nodes - Implementing and using asynchronous nodes
Leader Election - Starting leader election and accessing election results
Messaging - How to send and receive messages between the components of your system
Metrics - What metrics are exposed by default, and how to add custom metrics to your nodes

Built-In Types

Kafka Producer - Node for producing events onto a Kafka topic
Elasticsearch - Node for indexing documents to an Elasticsearch cluster
