Tuesday 21 October 2014

On Google App Engine, Ancestor Queries are Almost Never What You Need


Recently, the company where I work announced the alpha release of Djangae, a compatibility layer that allows your Django application to run on App Engine and store its data in the App Engine Datastore. One of the things missing from the alpha was support for the Datastore's so-called "ancestor queries".

The App Engine Datastore is a remarkable feat of engineering. It's a non-relational database which can scale to store mind-boggling amounts of data and deal with crazy high amounts of traffic. Of course, the sacrifice is that it's non-relational, so there are no joins, aggregate queries or the like. And if you want to count things then expect it to take some time!

Behind the scenes your data is seamlessly replicated and distributed across Google servers, which makes it extraordinarily reliable and performant.

Google achieves this by dividing your entities into "entity groups". The Datastore allows you to mark entities as being in a group by specifying their Ancestor when you create them. When you do this, the path to the root ancestor forms part of your entity's primary key and each member of the tree is part of the same entity group. Each entity group has its own index, and updates within the entity group are consistent.

If you edit an entity and then query for it using the entity's ancestor, your results are strongly consistent. However, if you query without specifying the entity's ancestor, your results may be stale. Your query might return entities that have been deleted, might fail to return new entities, or might return entities with stale data. This is because when you don't specify an ancestor, your query looks at the global index of all entities, which is not strongly consistent. It's eventually consistent (updates will lag for a few seconds). Eventual consistency is a bitch to work with.

When you perform an ancestor query, the Datastore only looks at the entity group's local index, so you've already eliminated every entity in your datastore except the ones below the specified ancestor, which is why consistent results are possible.
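To make the difference concrete, here is a toy, in-memory model. It is not the real Datastore API: `ToyDatastore` and all of its methods are invented for illustration. Writes update the entity and its group's local index immediately, while the global index only catches up when `replicate()` is called, which stands in for the few seconds of lag.

```python
class ToyDatastore:
    """Toy model: per-group indexes update on write, the global index lags."""

    def __init__(self):
        self.entities = {}      # key -> properties (always up to date)
        self.group_index = {}   # root key -> set of keys in that entity group
        self.global_index = {}  # key -> properties as last replicated
        self._pending = []      # writes not yet applied to the global index

    def put(self, key, props):
        # A key is a tuple of (kind, id) pairs; the first pair is the root ancestor.
        root = key[:1]
        self.entities[key] = dict(props)
        self.group_index.setdefault(root, set()).add(key)
        self._pending.append((key, dict(props)))

    def replicate(self):
        # Stand-in for the delay before the global index catches up.
        for key, props in self._pending:
            self.global_index[key] = props
        self._pending = []

    def ancestor_query(self, ancestor, **filters):
        # Only looks inside one entity group, so it sees every committed write.
        keys = self.group_index.get(ancestor[:1], set())
        return sorted(
            k for k in keys
            if k[:len(ancestor)] == ancestor
            and all(self.entities[k].get(f) == v for f, v in filters.items())
        )

    def global_query(self, **filters):
        # Looks at the lagging global index, so results may be stale.
        return sorted(
            k for k, props in self.global_index.items()
            if all(props.get(f) == v for f, v in filters.items())
        )


store = ToyDatastore()
author = (("Author", 1),)
book = author + (("Book", 10),)
store.put(author, {"name": "alice"})
store.put(book, {"title": "Bananas"})

stale = store.global_query(title="Bananas")            # global index has not caught up
fresh = store.ancestor_query(author, title="Bananas")  # entity group index is current
store.replicate()
caught_up = store.global_query(title="Bananas")        # now the book shows up
```

In this model `stale` is empty while `fresh` already contains the new book; only after `replicate()` does the global query find it.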

There are a bunch of drawbacks to using ancestor queries though:
  • All entities within the same group are limited to an overall total of 1 write per second
  • To look up a descendant by key, you need to know the entire path to the root ancestor
  • Keys just get confusing (does it have a parent? Can I look it up by kind and ID?)
  • Moving descendants around means destroying and recreating them, and transferring any references to their old key
These are some pretty annoying drawbacks, and I often wonder why Google decided to make the entity group part of the key, rather than having another ID property built into each entity (e.g. __group__), which would alleviate some of the issues. But then again, I didn't build the Datastore, and I'm sure there are reasons.
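The key-related drawbacks are easier to see if you think of a key as the full path from the root ancestor. The sketch below uses plain tuples and a dict rather than real Datastore keys, and `make_key` and `move` are hypothetical helpers made up for this example.

```python
def make_key(*path):
    """Build a toy key from alternating kind/id values,
    e.g. make_key("Author", 1, "Book", 10)."""
    if len(path) % 2 != 0:
        raise ValueError("path must be alternating kind, id values")
    return tuple(zip(path[0::2], path[1::2]))


def move(store, old_key, new_parent):
    """'Moving' a descendant means creating it under a new key and deleting the old one."""
    new_key = new_parent + old_key[-1:]
    store[new_key] = store.pop(old_key)
    return new_key


store = {}
book_key = make_key("Author", 1, "Book", 10)
store[book_key] = {"title": "Bananas"}

# Kind and ID alone are not enough: this is a different key, so the lookup misses.
missing = store.get(make_key("Book", 10))

# Re-parenting changes the key, so any stored reference to book_key is now dangling.
moved_key = move(store, book_key, make_key("Author", 2))
```

After the move, anything that still holds `book_key` points at an entity that no longer exists, which is the "transferring any references" problem from the list above.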

Anyway, there is an alternative to ancestor queries which works better in nearly all situations: you can maintain a list of child IDs on the parent object. App Engine's list properties allow you to store up to 500 items (e.g. integers) in a single field. These items are indexed. By storing the IDs of related child entities on a logical parent, you gain the following things:
  • Each entity has its own entity group, so suddenly the write rate isn't so bad
  • You can get all child entities consistently and at once by doing a Get rather than a query (you can then filter them in memory)
  • You can query for the parent object by child ID (as list properties are indexed)
  • Your entities can all be looked up with their ID
  • Migrating child entities to a different parent just means updating the list on the parent
Djangae will automatically transform a PK filter into a datastore Get, which means MyModel.objects.filter(pk__in=parent.child_ids, username="bananas") would do a consistent Get, then only return the results where the username was 'bananas'. You can also use post-save/delete signals etc. to keep the list up to date.
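As a rough sketch of the pattern, again in plain Python rather than the real Datastore or Djangae API: `get_multi` below stands in for a consistent Get by key, and the `Team`/`Member` kinds and every function name are made up for the example.

```python
# Every entity is a root entity keyed by (kind, id); no ancestors involved.
entities = {
    ("Team", 1): {"name": "fruit", "child_ids": [10, 11]},
    ("Member", 10): {"username": "bananas"},
    ("Member", 11): {"username": "apples"},
    ("Member", 12): {"username": "bananas"},  # not listed on any team
}


def get_multi(kind, ids):
    """Stand-in for a consistent Get by key: returns the entities that exist."""
    return [entities[(kind, i)] for i in ids if (kind, i) in entities]


def children_of(parent_key, child_kind, **filters):
    """Fetch the children listed on the parent, then filter them in memory."""
    parent = entities[parent_key]
    children = get_multi(child_kind, parent["child_ids"])
    return [c for c in children if all(c.get(f) == v for f, v in filters.items())]


def parents_of(parent_kind, child_id):
    """Toy version of querying the indexed list property by child ID.
    (On the real Datastore this would be an ordinary, eventually consistent query.)"""
    return [
        key for key, props in entities.items()
        if key[0] == parent_kind and child_id in props.get("child_ids", [])
    ]


def move_child(child_id, old_parent_key, new_parent_key):
    """Re-parenting only touches the two lists; the child keeps its key."""
    entities[old_parent_key]["child_ids"].remove(child_id)
    entities[new_parent_key]["child_ids"].append(child_id)


matches = children_of(("Team", 1), "Member", username="bananas")
owners = parents_of("Team", 11)
```

Here `matches` contains only member 10: member 12 has the same username but is not in the parent's list, so it is never fetched.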

This is why Djangae doesn't have ancestor support yet: in nearly every situation I've come across, denormalizing child IDs has been a better solution than building an ancestor tree. We'll get ancestor support into Djangae in time, but don't wait for it. Just structure your data smartly.
