Indexer as a service. Proof of concept

UPD Mar 30, 2022:

We’ve created a new way of building indexers. Please meet NEAR Lake

Problem description:
Currently at NEAR we have a tool you may be familiar with: the NEAR Indexer Framework.

I know a few teams who have built indexers using the Indexer Framework.

But many more teams avoid building indexers, and the ones who do build them struggle.

Running an indexer is almost the same as running a NEAR node, but you also ship custom code, so you can’t simply pick up prebuilt binaries on nearcore updates. You end up maintaining a “custom NEAR node” of sorts.

It can be expensive: it requires investments of time and money, and it distracts you from your business.

Solution we see:
To create Indexer as a Service.

We want to allow our users to focus on their business. We can achieve this by providing a tool with which they can:

  • benefit from indexer features
  • avoid running a NEAR node
  • decrease time and money investments for maintenance
  • focus on the business, not the tools.

We decided to start by providing simple yet useful tooling to gauge whether users want it.

Indexer-as-a-service nodes will store every single block in blob storage (AWS S3 or an alternative) as JSON, named by block hash. An API endpoint of the Indexer as a service will return any block by its hash.
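The storage scheme above can be sketched in a few lines. This is a minimal sketch assuming a `"<block_hash>.json"` key-naming convention and a simplified block shape; an in-memory dict stands in for the S3 bucket, and the function names (`store_block`, `get_block`) are illustrative, not a real API.

```python
import json

# In-memory stand-in for a blob storage bucket (e.g. AWS S3); the
# "<block_hash>.json" key naming is an assumption for illustration.
blob_storage = {}

def store_block(block):
    """Serialize a block to JSON and store it under its hash."""
    key = f"{block['header']['hash']}.json"
    blob_storage[key] = json.dumps(block)
    return key

def get_block(block_hash):
    """What the API endpoint would do: return any stored block by its hash."""
    return json.loads(blob_storage[f"{block_hash}.json"])

block = {"header": {"hash": "abc123", "height": 42}, "chunks": []}
store_block(block)
assert get_block("abc123")["header"]["height"] == 42
```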

We’re going to define a set of triggers that essentially represent certain kinds of events on the network:

  • receiver_id is observed (receiver_id is user-provided)
  • The balance of the account has changed (account is user-provided)
  • Epoch has changed
  • etc.

We estimate there are up to 20 event types that could interest application developers.

When an event occurs, a POST request with the relevant piece of data and the block hash will be sent to a user-provided endpoint. By “relevant piece of data” we mean the part of the block that corresponds to the user-selected event. For example, if a user is watching for “receiver_id is observed” and the Indexer as a service notices a Receipt with the receiver_id provided by the user, a POST request with the block hash and the Receipt will be sent. All other transactions, receipts, and execution outcomes will be skipped as non-relevant for that user. The user will be able to request the entire block from blob storage via the Indexer-as-a-service API endpoint.
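The “receiver_id is observed” trigger described above could look roughly like this. The block and receipt shapes are simplified assumptions, not the exact nearcore types, and `match_receipts` is a hypothetical name:

```python
def match_receipts(block, watched_receiver_id):
    """Build POST payloads for every receipt addressed to the watched account."""
    payloads = []
    for receipt in block.get("receipts", []):
        if receipt["receiver_id"] == watched_receiver_id:
            payloads.append({
                "block_hash": block["header"]["hash"],
                "receipt": receipt,  # only the relevant piece; the rest is skipped
            })
    return payloads

block = {
    "header": {"hash": "abc123"},
    "receipts": [
        {"receipt_id": "r1", "receiver_id": "alice.near"},
        {"receipt_id": "r2", "receiver_id": "bob.near"},
    ],
}
payloads = match_receipts(block, "alice.near")
assert len(payloads) == 1 and payloads[0]["receipt"]["receipt_id"] == "r1"
```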

The PoC will look as follows (from the user’s perspective):

[To be discussed additionally] A user buys Indexer as a service fungible tokens on the service’s contract.

In the service contract, a user will set which predefined event they are interested in. They will provide a set of parameters relevant to the event and the endpoint to which the POST request should be sent. For security reasons, the endpoint might be encrypted with a service-provided public key.

Once the event is emitted on the network, a POST request is sent to the user’s endpoint. The user is charged for checking every block against their “filters” (a small yet reasonable amount to cover the CPU cost), plus either the number of requests or the number of bytes sent to them.
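The contract-side state this flow needs can be modeled as follows. The real contract would be written in Rust; the class and method names here (`buy_coins`, `create_filter`, `apply_invoice`) and the 1 NEAR = 1,000-coin rate are illustrative assumptions, not the actual contract interface:

```python
COINS_PER_NEAR = 1_000  # assumed exchange rate, to be discussed

class IndexerServiceContract:
    """Python model of the state the on-chain service contract could track."""

    def __init__(self):
        self.balances = {}   # account_id -> Indexer coins
        self.filters = {}    # account_id -> list of filter definitions

    def buy_coins(self, account_id, near_attached):
        self.balances[account_id] = (
            self.balances.get(account_id, 0) + near_attached * COINS_PER_NEAR
        )

    def create_filter(self, account_id, event, params, endpoint):
        self.filters.setdefault(account_id, []).append(
            {"event": event, "params": params, "endpoint": endpoint, "enabled": True}
        )

    def apply_invoice(self, invoice):
        """Owner-only method: decrease balances by the charges in the invoice."""
        for account_id, charge in invoice.items():
            self.balances[account_id] = max(0, self.balances.get(account_id, 0) - charge)

contract = IndexerServiceContract()
contract.buy_coins("alice.near", 1)
contract.create_filter("alice.near", "receiver_id_observed",
                       {"receiver_id": "shop.near"}, "https://example.com/hook")
contract.apply_invoice({"alice.near": 40})
assert contract.balances["alice.near"] == 960
```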

Contract UI

Draft of the contract UI (made quick and dirty): https://www.figma.com/file/kc2Gnb333T98IyMslKhezC/Indexer-Service-Contract-UX-draft?node-id=0%3A1

Basically the flow:

  1. A user logs in with NEAR Wallet
  2. A user sees all the so-called filters created earlier (if any)
  3. A user can toggle filters w/o deleting them
  4. A user can buy “Indexer coins” to spend them on checks and events
  5. A user can create a new filter

Contract UX:

As for the contract (all of this is still to be discussed and clarified):

Assume a user can buy 1,000 Indexer coins for 1 NEAR to spend on checks and requests.

A user buys coins and submits their filter. The Indexer Service keeps track of the user’s balance and decides whether to include the user’s filters at every block-handling step.

If the balance is positive, the Indexer Service checks the block against the user’s filters and, right after the check, increments the counter in an “invoice”.
If the event triggers the user’s filter and the service needs to send a request, it checks the user’s balance again; if it is sufficient for a request, the service increments the counter in the “invoice” once more.

Once every 10 minutes (think of it as the service’s epoch), the service performs a function call to an owner-only method on the service contract (using a FUNCTION_CALL access key) to send the invoice. Once the invoice is received, the contract decreases the balances of the users listed in it.

Obviously, on the service side, the invoice is then cleared and a new 10-minute cycle starts.

P.S. If a user’s balance reaches zero in the middle of a 10-minute cycle, the service simply skips that user’s filters.
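The service-side accounting described above can be sketched like this. The per-check and per-request prices are made-up assumptions (pricing is not decided yet), and `process_block` is a hypothetical helper:

```python
CHECK_COST = 1    # coins per block checked against a filter (assumed price)
REQUEST_COST = 5  # coins per POST request sent (assumed price)

def process_block(block_matches, balances, invoice):
    """block_matches: account_id -> whether this block matched their filter."""
    for account_id, matched in block_matches.items():
        owed = invoice.get(account_id, 0)
        if balances.get(account_id, 0) - owed < CHECK_COST:
            continue  # balance exhausted mid-cycle: skip this user's filters
        invoice[account_id] = owed + CHECK_COST
        if matched and balances[account_id] - invoice[account_id] >= REQUEST_COST:
            invoice[account_id] += REQUEST_COST  # request sent, charge for it

balances = {"alice.near": 10, "bob.near": 0}
invoice = {}
process_block({"alice.near": True, "bob.near": True}, balances, invoice)
assert invoice == {"alice.near": 6}  # 1 for the check + 5 for the request
# Every 10 minutes the service would submit `invoice` to the contract
# and reset it to {} for the next cycle.
```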

How to handle it?

We imagine a user might run a simple HTTP server that is ready to receive a specific JSON structure from the Indexer Service. After some manipulation of the data (if necessary), the handler can send a message to a Telegram bot or store it in a MongoDB database for further use. Such handlers could even be hosted as AWS Lambda functions or on alternative platforms.
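The core of such a handler is just parsing the pushed JSON and turning it into an action. A minimal sketch, assuming a payload shape of `{"block_hash": ..., "receipt": ...}` (the exact structure is not finalized); wiring this into an HTTP server, Lambda, Telegram, or MongoDB is left out:

```python
import json

def handle_event(raw_body):
    """Parse the JSON body pushed by the Indexer Service into a message."""
    payload = json.loads(raw_body)
    receipt = payload["receipt"]
    return (
        f"Receipt {receipt['receipt_id']} for {receipt['receiver_id']} "
        f"observed in block {payload['block_hash']}"
    )

body = json.dumps({
    "block_hash": "abc123",
    "receipt": {"receipt_id": "r1", "receiver_id": "alice.near"},
}).encode()
assert handle_event(body) == "Receipt r1 for alice.near observed in block abc123"
```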

The user code will also be able to pull the previous and next blocks to fill in potentially missing information (e.g. cross-contract calls), or perform RPC calls to fetch relevant details if the data pushed in the POST request from the Indexer as a service is not sufficient.
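Pulling the previous block might look like this. It assumes each block header carries a `prev_hash` field and uses a local dict plus a `fetch_block` stand-in for the Indexer-as-a-service “block by hash” endpoint (both are illustrative assumptions):

```python
# Local stand-in for the block storage behind the service's API endpoint.
blocks_by_hash = {
    "h1": {"header": {"hash": "h1", "prev_hash": None}},
    "h2": {"header": {"hash": "h2", "prev_hash": "h1"}},
}

def fetch_block(block_hash):
    """Stand-in for the Indexer-as-a-service 'block by hash' API call."""
    return blocks_by_hash[block_hash]

def previous_block(block):
    """Walk back one block via the header's prev_hash, if any."""
    prev_hash = block["header"]["prev_hash"]
    return fetch_block(prev_hash) if prev_hash else None

current = fetch_block("h2")
assert previous_block(current)["header"]["hash"] == "h1"
```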

Implementation of PoC plan:

  • Choose one of the possible events to implement
  • Write a contract for Indexer Service:
    • Fungible tokens
    • Possibility to choose an event and set necessary parameters
    • User charging
  • Indexer Service node
    • Read the contract state for events to track and send
    • Listen to every block for tracking events
    • Store block in blob storage
    • Send requests to users when tracked events fire
    • Function call to IaaS to charge users
  • Frontend
    • Indexer Service contract frontend
    • Login
    • Buy IaaS fungible tokens
    • CRUD events
    • Enable/Disable events

We encourage those who build their own indexers to share their thoughts on whether this project would be interesting to them.
We encourage those who avoid building indexers to share their use cases so we can empower them with a proper tool.
And we encourage everyone to share their thoughts about such a service.

  • Are you interested?
  • Do you need something like that?

Thank you!

13 Likes

This is an awesome idea and I definitely think that members of the community will be interested and use this service! I’ve worked with NEAR for a little under a year now and I personally had trouble setting up our indexer (which is open source if anybody wants to use it).

I feel as though there is a relatively high barrier to entry for new users and it can definitely feel overwhelming trying to understand how everything works. A very generic use case that a lot of users might have is keeping their own databases synced with the blockchain. For this, they would need to run an indexer that listens for events and updates their database accordingly.

Users might not care about how the indexer works and simply want to be able to use it. The proposed PoC is a great way to tackle this issue.

The only question I have is related to performance: what are the performance drawbacks of using the indexer as a service vs. running your own indexer?

1 Like

We don’t expect any performance drawbacks. Actually, our users will pay for the service precisely to avoid complexity and additional drawbacks. However, we won’t be able to guarantee good performance on the user’s side (where the handlers are).

2 Likes

What do we want to achieve with the smart contract that charges for the service? It seems to me that handling it through smart contracts is expensive and also incurs additional overhead related to security and privacy.

The financial side of things, plus using the contract state as a backend/storage that is already shared between all nodes.

How does this contract work exactly? Do we charge a one-time fee or will there be some sort of recurring payments?

1 Like

Recurring.

We plan to charge a user who has active filters:

  • for each block the Indexer Service checks (it takes resources to check whether the data in a block matches the user’s filter)
  • if a filter matches and a request has been sent, we plan to charge for either the number of requests sent or the number of bytes sent (not decided yet)

We are considering charging our users in bulk once every 10 minutes.

@Bowen I don’t get your concerns about the contract. IMO it’s the best way to handle the financial part of the service. Also, I don’t think an external off-chain database and off-chain payment gateway would be better. An on-chain solution looks best for this purpose from my point of view: stable, reliable, transparent, and showcasing the power of the Open Web.

1 Like

How does the UX work here? Could you explain the workflow in more detail?

To be clear, I am not against this idea per se. I just want to better understand how it works in practice.

@Bowen I’ve quickly put together a UX draft. You can see it here: https://www.figma.com/file/kc2Gnb333T98IyMslKhezC/Indexer-Service-Contract-UX-draft?node-id=0%3A1
(I’ve added the link to the original post)

Basically the flow:

  1. A user logs in with NEAR Wallet
  2. A user sees all the so-called filters created earlier (if any)
  3. A user can toggle filters w/o deleting them
  4. A user can buy “Indexer coins” to spend them on checks and events
  5. A user can create a filter

Yeah, I’ve just realized you’re probably not interested in the UI 🙂

As for the contract (all of this is still to be discussed and clarified):

Assume a user can buy 1,000 Indexer coins for 1 NEAR to spend on checks and requests.

A user buys coins and submits their filter. The Indexer Service keeps track of the user’s balance and decides whether to include the user’s filters at every block-handling step.

If the balance is positive, the Indexer Service checks the block against the user’s filters and, right after the check, increments the counter in an “invoice”.
If the event triggers the user’s filter and the service needs to send a request, it checks the user’s balance again; if it is sufficient for a request, the service increments the counter in the “invoice” once more.

Once every 10 minutes (think of it as the service’s epoch), the service performs a function call to an owner-only method on the service contract (using a FUNCTION_CALL access key) to send the invoice. Once the invoice is received, the contract decreases the balances of the users listed in it.

Obviously, on the service side, the invoice is then cleared and a new 10-minute cycle starts.

P.S. If a user’s balance reaches zero in the middle of a 10-minute cycle, the service simply skips that user’s filters.

I really hope I’ve explained it clearly. Feel free to ask questions if my explanation is messy anywhere.

1 Like

Thanks for the explanation. One more question: what is the workflow to ask users to top up their balances? Do we notify the users when their balances are low?

1 Like

It’s a great question; I’d forgotten about it. It’s solvable, though. I think we can come up with an approach later.
We could go with a TG bot or add an (encrypted) email field for notifications. I don’t think it’s a blocker for the proof of concept. What do you think?

2 Likes

Yeah, sounds good to me.

2 Likes

With a good indexer (low-cost, complete, fast, …), there would be an opportunity to build a special Explorer: a Where-Used/What-Used Explorer. All (related) data could be visualised in one click, without any further explicit searching or other actions to get to the final result; only further navigation is needed. It could really open up the blockchain to the user. For now, the classic Explorers are the only small windows (with limited visibility) onto the blockchain for end users. These Explorers need frequent input of search arguments and selections from the user, who has to be aware of the structure and interaction of the blockchain data.

From each entry point (account, action, address, date, status, token, transaction, …), each result is only a few steps away, completely linked together.
The special/specific user interface should do most of the remaining job: after indexing and scrambling, exploring within a dynamic grid (…, parents, root, children, …) and navigating on the edge of web standards (which could be a real blocking problem!).

Maybe not needed for all data (blocks, chunks, access_keys), but very useful for other specific data in an end-user context (accounts, wallets, balances, transactions, receipts, staking, history, collectibles, NFTs, tokens, dates, …).

As a result, NEAR would have a special blockchain explorer with aspects of a personal organiser/wallet. There are performance issues and web-interface issues, but it is perfectly doable, and it would open the blockchain to the end user.
Aspects of a personal organiser include memos, alerts, visual marks, duplicating and reordering, filtering, remembering navigation paths, and more.
Users could organise their own blockchain data, and other selected blockchain data, in a way that helps them better control their wallet.

Just as a flat data file takes on another dimension when imported into a spreadsheet, a filtered subset of a blockchain can be better displayed through a dynamic grid.

By a dynamic grid, I mean a grid that dynamically adds and removes columns, where each column can display a whole filtered blockchain. The columns are created and linked dynamically per cell/item, and the whole looks like an open book.

The book is virtually unlimited in size (cf. the blockchain), where each data item/cell can become the root, with its parents/grandparents/… and children/grandchildren/… in a small, configurable, limited physical grid (from 7P > 3P/R/1C > 5C, or the smallest grid with only 3 columns: 3P > 2P/R > P/R/C > R/2C > 3C), where P = parent column, R = root column, C = child column. The grid scales/stretches differently on a web page (7 columns or more), in a mobile app (3 columns), or down to just 1 column if the wallet has lost its metadata. The last situation is comparable to the current “Recent Activity” backed by the Explorer via “View All”.
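The re-rooting behaviour of such a grid can be loosely sketched as follows. The data model, the parent/child links, and the minimal parents/root/children layout are my own illustrative assumptions, not the proposed design:

```python
# Hypothetical parent links between blockchain objects (child -> parent).
parents = {"tx1": "block9", "receipt1": "tx1", "receipt2": "tx1"}

def children_of(node):
    return [child for child, parent in parents.items() if parent == node]

def grid_view(root):
    """Smallest grid: one parent column, the root, one child column."""
    parent = parents.get(root)
    return {
        "parents": [parent] if parent else [],
        "root": root,
        "children": children_of(root),
    }

view = grid_view("tx1")
assert view == {"parents": ["block9"], "root": "tx1",
                "children": ["receipt1", "receipt2"]}
# Selecting any cell re-roots the grid, with no new explicit search needed:
assert grid_view("receipt1")["parents"] == ["tx1"]
```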

  1. Indexer as a Service.
  2. Scrambler as a Service.
  3. Dynamic Grid Explorer.
  4. Optional Organiser (with small local db needs).

I have included a simulation with really bad test data by means of 7 screenshots, that can simulate the grid navigation in a limited way.
The data is not really related, but just forced to be used as position fillers, to demonstrate the nature of the grid.
It would be much better with real data, because it’s all about the data. Because the data is so bad, it’s difficult to follow the dependencies/links on a first navigation. It just gives an impression of what I mean by a dynamic grid: a grid that generates the content of the next column for each individually selected cell/object. In the real grid, every cell can become the next new root, with its parents/children, and so the navigation starts over without limitation.

If you select one image, and then use the left and right arrows in a correct way (the left arrow, when going left until 3/green and the right arrow, when going to the right until 3/blue), you get an almost realistic simulation of the navigation.

The screenshots are not all very consistent; they change the currently selected cell in some cases (in columns 1/green and 1/blue). This is not correct, but the result of some mistakes/retries when taking the shots, and I was not willing to redo the previous ones. It’s all about the global picture; don’t look at the details.







Hey! What’s the status of this project? Or is anything similar already up? I’m looking for a way to scan the blockchain for NFT transactions tied to a predefined smart contract. I think an indexer like this is pretty much exactly what I need.

Sorry for the late reply. We’re going to announce a similar project very soon; please stay tuned.

And here’s the announcement I mentioned in my previous post: [Announcement] NEAR Lake Framework - brand new word in indexer building approach