
Yeah, a social graph is the traditional example, because as soon as you talk graphs, people are like “Oh, Facebook, and Twitter”, and all that stuff… Which is great. It actually works pretty well for those use cases. But in general, that is again a class of knowledge graphs; graphs where you’re storing information. For instance, Dgraph actually comes from technology developed at Google. I think that internally the project was also called Dgraph… Which is weird, but it’s okay, because Google never uses the internal name externally, or something like that, so it was cool…

But yeah, it is the same technology, and the idea is when you’re storing information for movies, and actors, and all that stuff, if you think about the – I think it’s called OneBox, which is when you search something and you have on the right side of the search this extra information telling you the actor, and the movies they’ve been in, and stuff like that… That actually is served by a graph.
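To make the actor-and-movie idea concrete, here is a minimal sketch of that kind of lookup over a tiny in-memory graph. It is not how Google or Dgraph implement it; the triples, the `acted_in` predicate, and the helper names are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("Actor A", "acted_in", "Movie X"),
    ("Actor A", "acted_in", "Movie Y"),
    ("Actor B", "acted_in", "Movie X"),
    ("Actor B", "acted_in", "Movie Z"),
]


def build_index(triples):
    """Index every edge in both directions so traversal is a dict lookup."""
    forward = defaultdict(set)   # (subject, predicate) -> objects
    reverse = defaultdict(set)   # (object, predicate)  -> subjects
    for subject, predicate, obj in triples:
        forward[(subject, predicate)].add(obj)
        reverse[(obj, predicate)].add(subject)
    return forward, reverse


def movies_of(actor, forward):
    """One hop: actor -> movies."""
    return sorted(forward.get((actor, "acted_in"), set()))


def co_stars(actor, forward, reverse):
    """Two hops: actor -> movies -> other actors in those movies."""
    found = set()
    for movie in forward.get((actor, "acted_in"), set()):
        found |= reverse.get((movie, "acted_in"), set())
    found.discard(actor)
    return sorted(found)
```

Each hop is a lookup on an edge rather than a join across tables, which is roughly why this kind of "actor, then their movies, then the people in those movies" question feels natural on a graph.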

So the idea is that in order to store that information and be able to retrieve things easily, graphs are the best way to do it. But then on top of that, that knowledge graph could be something that changes really fast. We have seen people doing knowledge graphs on – it’s like a visibility layer more than anything, on top of Kubernetes.

So if you think about Kubernetes, you have services, and pods, and all of these things. They’re all related in many different ways. There are tags, there’s traffic, there’s services sending things around… You can visualize that outside of Kubernetes, inside of a graph database, and then query things like “Hey, what are the things that could impact this service if they go down?” Or the other way around: “If this service goes down, what pieces of my system would be impacted?” And that is a graph problem.
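As a rough sketch of that impact question, both directions come down to reachability over a dependency graph. The service names and the `DEPENDS_ON` edge list below are made up for illustration, and a plain breadth-first search stands in for whatever a real graph database would do internally.

```python
from collections import defaultdict, deque

# Hypothetical edges: (caller, callee) means "caller depends on callee".
DEPENDS_ON = [
    ("frontend", "checkout"),
    ("frontend", "catalog"),
    ("checkout", "payments"),
    ("checkout", "inventory"),
    ("catalog", "inventory"),
]


def build_adjacency(edges):
    """Return (dependencies, dependents) adjacency maps."""
    deps = defaultdict(set)        # service -> services it calls
    dependents = defaultdict(set)  # service -> services that call it
    for caller, callee in edges:
        deps[caller].add(callee)
        dependents[callee].add(caller)
    return deps, dependents


def reachable(start, adjacency):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With these maps, `reachable("inventory", dependents)` answers "what is impacted if inventory goes down", and `reachable("checkout", deps)` answers "what could take checkout down".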

There’s actually an open source project created by VMware. It’s called Purser. And there are many others. There are things like geographic graphs, actually… We have geolocation, so you can do things like “Find all the hotels that are less than 50 miles from downtown San Francisco”, and then from there do more graph stuff. So you can go quite deep into finding things about your dataset that would otherwise be very hard… Because if you think about all of these queries, you could definitely do them anywhere. You could do them on a normal database, on a relational database. The problem is that it’s going to be way, way slower.
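For a sense of what the "hotels within 50 miles" filter involves, here is a small sketch using the haversine great-circle distance. This is not Dgraph's geo query syntax; the hotel records are invented, the coordinates for downtown San Francisco are approximate, and a real database would use a spatial index rather than scanning every record.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

# Approximate coordinates for downtown San Francisco (assumption).
DOWNTOWN_SF = (37.7749, -122.4194)

# Hypothetical hotel records.
HOTELS = [
    {"name": "Hotel A", "lat": 37.80, "lon": -122.27},
    {"name": "Hotel B", "lat": 37.55, "lon": -122.30},
    {"name": "Hotel C", "lat": 34.05, "lon": -118.24},
]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(places, center, radius_miles):
    """Names of places whose distance from center is at most radius_miles."""
    lat0, lon0 = center
    return [
        place["name"]
        for place in places
        if haversine_miles(lat0, lon0, place["lat"], place["lon"]) <= radius_miles
    ]
```

The result of `within_radius(HOTELS, DOWNTOWN_SF, 50)` is the starting set of nodes, and the "more graph stuff" would then follow edges out from those hotels.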

[00:16:08.05] I don’t know if you’ve ever used BigQuery. With BigQuery you end up writing queries that run across terabytes of data… And it’s not about the fact that it’s easier to write, it’s just that short developer loop… It’s just much better. You get the feedback and you keep on playing, while if you need to wait for five minutes it’s gonna be way, way slower and more painful.

One other use case that we see very often is that, since there’s no need for a schema, you can actually integrate a bunch of different datasets together very easily. Dataset integration or dataset identification happens pretty often. Imagine you have a really large telecom company that has been acquiring different companies, and for every single one of those acquired companies there’s probably a user base. There’s an account database with a bunch of different things. And what you wanna do is be able to integrate all of those systems together into one… Graph databases are great for that.
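One way to picture that integration, as a sketch only: treat every account record and every identifier it carries as a node, link a record to its identifiers, and then each connected component is one customer. The records, field names, and the rule that an exact shared email or phone means "same person" are all assumptions for illustration; real matching is usually fuzzier.

```python
from collections import defaultdict

# Hypothetical account records from three acquired companies.
RECORDS = [
    {"source": "company_a", "id": "a1", "email": "pat@example.com", "phone": None},
    {"source": "company_b", "id": "b7", "email": "pat@example.com", "phone": "555-0100"},
    {"source": "company_c", "id": "c3", "email": None, "phone": "555-0100"},
    {"source": "company_c", "id": "c4", "email": "sam@example.com", "phone": None},
]


def merge_accounts(records, keys=("email", "phone")):
    """Group records that are connected through any shared identifier."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    for rec in records:
        rec_node = ("record", rec["source"], rec["id"])
        find(rec_node)  # register the record even if it has no identifiers
        for key in keys:
            value = rec.get(key)
            if value:
                union(rec_node, (key, value))  # edge: record -- identifier

    groups = defaultdict(list)
    for rec in records:
        root = find(("record", rec["source"], rec["id"]))
        groups[root].append(f'{rec["source"]}:{rec["id"]}')
    return sorted(sorted(group) for group in groups.values())
```

Notice that `a1` and `c3` share nothing directly; they end up together only because `b7` links the email to the phone number. That transitive hop is the part that gets painful with foreign keys and comes for free when the data is already a graph.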

If you think about how you would do it with a relational database, the number of foreign keys that are gonna be flying around is gonna be a pain. So that kind of thing is also very useful. It’s very simple, and it works really well.