The World of Data Services

a review by Athsrueas.eth | Thomas Freestone

Introduction

This concise review aims to showcase the value that The Graph offers as a data service platform. I'll explore leading SaaS platforms, established statistics tools, Microsoft options, and R/Python solutions, focusing on their suitability for data analysts or anyone interested in data-driven decision making. With Horizon, I hope that many of these services will gain comparable competitors on The Graph protocol, and I will showcase some that already exist or have been announced by developers working on The Graph.

Existing Data Services

Whether you're comfortable with code or prefer user-friendly interfaces, this review covers services that fit both niches. If you are in the market for data services, you likely already use some of these. I hope to introduce you to something new or generate some excitement for the possibilities on The Graph.

Database Platforms: Anyone working with data needs somewhere to keep it

Managing and operating a database "on premises" is always an option, but it demands technical skills that might be worth delegating to another team. Database as a service (DBaaS) platforms are very popular for this purpose, and there are many blogs already reviewing the best DBaaS options. There are also potential downsides, including obfuscated price structures and general pricing issues due to a lack of competition. I am convinced that for many applications a competitive decentralized network will be able to offer better pricing than a traditional DBaaS. The same argument applies to each of the following sections. The economic advantage is a big driver of the innovation in this space.

SaaS Platforms: Typically part of a cloud product with various billing styles

User-friendly platforms like Looker and Sisense, and tools like Power BI from Microsoft, are ideal for quick data exploration and visualization. Big names like Amazon, Microsoft, and Google offer cloud-based SaaS platforms. The biggest downside here is that you have little guarantee of security and privacy. Much of the data you host on these platforms is likely accessible by the administrators of the cloud platform. Contrast this with a network like The Graph, which is utilizing zero-knowledge proofs and other cryptographic features to enable private data to be processed by outside machines without compromising the confidentiality of your data.

Statistics Powerhouses: Often in the cloud, many likely hosted by the big 3

For heavy-duty analysis, many delve into SAS and IBM SPSS. Users can compare their advanced statistical capabilities and deployment options, then assess their suitability for specific data science tasks. Also, don't overlook the power of an Excel workbook.

Open-Source Options:

R and Python offer flexibility and cost-effectiveness, but come with a steeper learning curve. There are many blogs covering the top libraries for data projects. As with databases, many choose to use a hosted platform like Google Colab or Jupyter. These usually have all the libraries you could want already installed, and they let you run jobs on cloud computing resources instead of your own machines. This does come with a loss of security and control. There are also open-source options for databases, like ClickHouse. Many of these can and will be used in tandem with the services offered on The Graph.

The Graph: Open Source and Decentralized

Core dev teams like StreamingFast are hard at work developing tools for piping data in and out that will work with many of the tools people already use, like ClickHouse. Unlike traditional cloud services and SaaS companies, The Graph has many Indexer operators who run independently of each other. This means that there is no single entity, such as Google, that can decide that you cannot use a particular service. There are many individual operators that you could try to work with. The Graph developer groups aim to offer many different data services. In my mind, it is likely that any platform you currently use for your data processing will have an open-source, decentralized competitor on The Graph.

The Graph has also attracted lots of interest from developers on the bleeding edge of cryptography and data services, with long-term grants being extended to teams like Semiotic Labs and StreamingFast, who have developed best-in-industry technology like Odos and Substreams and are contributing it to The Graph. As the network and community grow, this will only become more true.

I am really excited about some of the products that the core devs have teased, such as the LLM companion agentc.xyz, presented by Semiotic Labs during core dev call 26. Agentc aims to be a general-purpose database search tool with LLM support, which will add lots of power to the SQL and GraphQL already available to the data services on The Graph. This product is an example of the kinds of AI innovations that will interface with the Substreams and deployable units that StreamingFast offers, giving users a much lower barrier to entry.
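For readers who have not used GraphQL from code before, here is a minimal Python sketch of what a query against a subgraph might look like. It assumes a standard GraphQL-over-HTTP endpoint that accepts a JSON POST body. The URL and the `transfers` entity with its fields are hypothetical placeholders, not a real deployment; a real subgraph publishes its own query URL and schema.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the query URL of a real subgraph.
SUBGRAPH_URL = "https://example.com/subgraphs/name/hypothetical-subgraph"

# Hypothetical entity and field names; every subgraph defines its own schema.
QUERY = """
query LatestTransfers($first: Int!) {
  transfers(first: $first, orderBy: timestamp, orderDirection: desc) {
    id
    timestamp
  }
}
"""


def build_request(url, query, variables):
    """Package a GraphQL query and its variables as a JSON POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(url, query, variables):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(url, query, variables), timeout=30) as resp:
        return json.load(resp)


# Building the request does not touch the network; only run_query does.
request = build_request(SUBGRAPH_URL, QUERY, {"first": 5})
```

Calling `run_query(SUBGRAPH_URL, QUERY, {"first": 5})` would only return data once the placeholder URL and entity names are swapped for those of an actual subgraph.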

Conclusion and Resources

If you want to learn more, I recommend following all of the teams mentioned in this blog to get the latest updates, and checking out resources like Gartner Magic Quadrant and Forrester Wave. Try The Graph for yourself and see the difference that it can make.

Links

The Graph website - https://thegraph.com/ - is where you can find currently available services on the hosted and decentralized networks. Here is also a link to the list of repositories in the FAQ.

Notion Core Dev site