Greetings from the Data Platform Team! We are happy and proud to announce the MVP release of NEAR Lake Framework, a brand-new approach to building indexers.
What are indexers used for?
Blockchains are great at applying requested changes to account state in a decentralized manner. However, to observe those changes, you need to actively pull the information from the network. You could do that through JSON-RPC, but it is not efficient. Instead, we created an abstraction that streams the information to your application block by block; it is called an Indexer.
Let me explain with an example. Let’s say you sell e-books via a contract on the blockchain. Once a book is bought, you want to send the file to the buyer via email. While you can ask for the customer’s email to be included in the transaction, you can’t send those emails directly from the chain. What you can do is have an off-chain helper that holds the e-book files and the email-sending implementation. You just need to empower that off-chain part with an indexer that analyzes incoming blocks, finds the confirmation that an e-book was bought, and triggers the rest of the off-chain pipeline that sends the email.
What’s wrong with the current NEAR Indexer Framework?
There is nothing wrong with NEAR Indexer Framework. It was designed as a minimal wrapper around the nearcore node that still provides a great facility for implementing Indexers that are part of the decentralized NEAR network.
However, for the end users who built their own indexers and run them, becoming a node operator was something unexpected and undesirable. Running a node requires a lot of resources, and maintenance is time-consuming.
So if your project required an indexer, you had no choice but to become a node operator and deal with all the pains like block syncing, regular maintenance, keeping up to date with the nearcore releases, etc.
Main disadvantages of using NEAR Indexer Framework:
- You need a lot of resources in terms of hardware and costs
- You constantly need to follow nearcore releases (if you miss a protocol upgrade, your indexer will get stuck)
- If your indexer gets stuck and you don’t notice, you can miss data and end up having to restore from blockchain data backups to speed up your node’s syncing
- The initial block syncing process takes a long time
- It is almost impossible to debug locally against testnet or mainnet data
What is so good about NEAR Lake Framework?
```
INFO stats: #61265200 Downloading blocks 97.93% (1089267)
```
If you feel the pain after seeing the log line above, you’ll love this section!
NEAR Lake Framework is a microframework on top of AWS S3 storage that makes it easy to build your own indexers! The storage is populated by NEAR Lake Indexer, which is maintained by the Data Platform team at Pagoda Inc.
In short, it resolves all the problems listed above:
- MVP version of NEAR Lake Framework consumes ~145MB of RAM and we’re going to improve it
- It doesn’t depend on nearcore and it is not a NEAR node so you don’t need to upgrade it every time nearcore cuts a release
- No syncing process is involved anymore, so even if your indexer gets stuck you can restart it from any block and have it running again immediately
- There is no huge data folder anymore, so you don’t need to pay for quite expensive 1TB SSD drives
- Want to debug your indexer on mainnet locally? No problem, do whatever you need
How to use it?
To answer this question, we have prepared a video tutorial with a simple example to give you an overview and some practical ideas.
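If you prefer to see code first, here is a rough sketch of a minimal indexer built with the Rust library, loosely following the early near-lake-raw-printer example. The exact `LakeConfig` fields and the `streamer()` signature may differ between releases, so treat the near-lake-framework-rs README as the source of truth:

```rust
// Rough sketch of a minimal NEAR Lake Framework indexer (MVP-era API).
// Field names and the streamer() signature are paraphrased from the early
// examples and may change in newer releases; consult the
// near-lake-framework-rs README for the exact, current API.
use near_lake_framework::LakeConfig;

#[tokio::main]
async fn main() {
    // Point the framework at the mainnet bucket and pick any block height to start from.
    let config = LakeConfig {
        s3_bucket_name: "near-lake-data-mainnet".to_string(),
        s3_region_name: "eu-central-1".to_string(),
        start_block_height: 61265200, // arbitrary height for illustration; no syncing required
    };

    // streamer() spawns a task that downloads block data from S3 and hands it
    // to your code over a channel, one StreamerMessage per block.
    let mut stream = near_lake_framework::streamer(config);

    while let Some(streamer_message) = stream.recv().await {
        // Your indexer logic goes here; this sketch just prints the block height.
        eprintln!("Block height: {}", streamer_message.block.header.height);
    }
}
```

Because the framework only reads from S3, stopping the process and restarting it with a different `start_block_height` is all it takes to replay or skip parts of the chain.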
How does it work?
The project consists of two pieces. The first one is NEAR Lake, an old-school indexer built on top of NEAR Indexer Framework. We run this indexer ourselves; it indexes all the blockchain data and stores it in AWS S3 buckets (one for testnet, one for mainnet).
The buckets are configured so that the requester pays for their usage, which enables you to consume the data from the AWS S3 buckets and pay AWS yourself.
The second piece is NEAR Lake Framework, a Rust library that allows you to build your own indexer in a similar way to NEAR Indexer Framework, but it reads the block data from the AWS S3 bucket instead of from an embedded NEAR node, so the data stream becomes available immediately on start! You only need to provide your AWS credentials so that the read access can be billed to you.
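To illustrate what happens under the hood (this is not part of the framework’s public API), the sketch below fetches a single block’s JSON straight from the requester-pays bucket with the official AWS Rust SDK. The `near-lake-data-mainnet` bucket name comes from the announcement above; the “12-digit block height / block.json” key layout is our description of how NEAR Lake lays out the data and should be verified against the near-lake-indexer repo:

```rust
// Sketch: reading one block object directly from the requester-pays bucket.
// Credentials are picked up from the standard AWS sources (environment
// variables or ~/.aws/credentials); set AWS_REGION=eu-central-1 for this bucket.
// Module paths are as of the aws-sdk-s3 versions available at the time of writing.
use aws_sdk_s3::model::RequestPayer;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let shared_config = aws_config::load_from_env().await;
    let client = aws_sdk_s3::Client::new(&shared_config);

    let object = client
        .get_object()
        .bucket("near-lake-data-mainnet")
        .key("000061265200/block.json") // assumed key layout; block height chosen for illustration
        .request_payer(RequestPayer::Requester) // you, the requester, pay for the transfer
        .send()
        .await?;

    let bytes = object.body.collect().await?.into_bytes();
    println!("{}", String::from_utf8_lossy(&bytes));
    Ok(())
}
```

NEAR Lake Framework does this listing and fetching for you and delivers the data to your code in block-height order, so your handler looks much like one written for NEAR Indexer Framework.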
By the way, we are considering implementing the NEAR Lake Framework library in JavaScript as our next step. Let us know if you are interested, so we can keep you updated! Subscribe to our newsletter here.
Running your own indexer has become easier than ever before. We encourage you to create your own indexers, and to migrate existing ones to NEAR Lake Framework if you already run one. Leave your questions or feedback in this thread. If you have issues building or migrating to NEAR Lake, open issues in the repositories or ask on StackOverflow with the tag “nearprotocol”.
Links:
- GitHub - near/near-lake-framework-rs — NEAR Lake Framework official repo (a library to connect to NEAR Lake S3 and stream the data)
- GitHub - near/near-lake-indexer — NEAR Lake indexer repo, the one the Data Platform Team is running (watches the NEAR network and stores all the data as JSON files on AWS S3)
- GitHub - near-examples/near-lake-raw-printer — a simple example of a data printer built on top of NEAR Lake Framework (prints the raw data from the stream)
- GitHub - near-examples/near-lake-accounts-watcher — source code for the video tutorial on how to use the [NEAR Lake Framework](https://github.com/near/near-lake-framework); another simple example indexer built for tutorial purposes
- Newest “nearprotocol” Questions - Stack Overflow — the “nearprotocol” tag on StackOverflow
- Subscribe to our Pagoda Dev Newsletter to keep up with the most recent product updates!