What do others use for storing lots of data?

Hey Panda gurus, I am working on a Minecraft-like project and I wanted to test out a performance tuning idea! I use MySQL at work all day every day and love how fast it is: I can execute a query affecting millions of rows in a matter of minutes, and a query against a handful of rows in a fraction of a second.

However, for a game that will be running on someone's computer, I don't really want to use MySQL, as it would require me to package it up, install it as part of the install process, import a default schema, and everything else. I don't want to do that to an end user's computer.

I started looking at what Panda offers natively. As a test, I ran a Perlin noise function over a 500x500x50 volume, stored the result in a dictionary, and used pickle to write that dictionary to a file. The Perlin noise function is used to generate terrain and store a value for every x,y,z position in the game world. The resulting pkl file was nearly 400mb! :open_mouth:

That's not going to work! But I didn't really want to do that anyway. I would like to find a small, database-like Python extension that can quickly look up the value for a given x,y,z coordinate, because I don't want to hold all of this in memory. And if someone plays just 3 generated levels, they're already over a gig of disk space!

Does anyone have any suggestions for something that will do the following?

  • Keyed indexing
  • Once the python is converted to an executable, the functionality needs to still work without external dependencies (e.g. MySQL)
  • Small output file size
  • Extremely fast lookup speed

Any good/bad/creative suggestions?



Haha, I was coming back to update this with "I just remembered that sqlite is built into Python!"
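A minimal sketch of what that could look like with the stdlib sqlite3 module. The table layout and the toy fill value `(x + y + z) % 256` are just assumptions for illustration, not anything from the actual project:

```python
import sqlite3

# Open (or create) a local database file: no server, no install step.
con = sqlite3.connect("terrain.db")
con.execute(
    "CREATE TABLE IF NOT EXISTS voxels ("
    "  x INTEGER, y INTEGER, z INTEGER, value INTEGER,"
    "  PRIMARY KEY (x, y, z))"
)

# Bulk-insert generated terrain values (a tiny 4x4x4 volume here).
con.executemany(
    "INSERT OR REPLACE INTO voxels VALUES (?, ?, ?, ?)",
    [(x, y, z, (x + y + z) % 256)
     for x in range(4) for y in range(4) for z in range(4)],
)
con.commit()

# Fast keyed lookup for a single coordinate via the primary key index.
row = con.execute(
    "SELECT value FROM voxels WHERE x = ? AND y = ? AND z = ?", (1, 2, 3)
).fetchone()
print(row[0])  # 6
```

The composite PRIMARY KEY gives you the keyed indexing from the wish list, and the whole thing ships inside a frozen executable because sqlite3 is part of the standard library.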


I am just wondering how many rows you would eventually have in your game. Like you said, millions of rows take minutes to process. I also have some experience using databases to hold large amounts of data (I think the largest was around half a million rows), and it did not work out; it was just too slow. I am not sure how you are building your game, but if you access the database frequently, I think it would slow your game down significantly due to the constant hard drive I/O.

For some extra information: I used Postgres as the database (due to MySQL's license). There were mixed reports about its speed compared to other databases, so if MySQL is faster, ignore everything I said.

If you write a 500x500x50 volume at 1 byte per voxel out as a binary file, you end up with about 12 megs of data.
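A quick sketch of that arithmetic using a flat bytearray; the x-major index layout and the filename are just choices made for the example:

```python
import os

SX, SY, SZ = 500, 500, 50  # same dimensions as the ~400 MB pickle

# One byte per voxel, flattened in x-major order.
data = bytearray(SX * SY * SZ)

def set_voxel(x, y, z, value):
    data[(x * SY + y) * SZ + z] = value

def get_voxel(x, y, z):
    return data[(x * SY + y) * SZ + z]

set_voxel(10, 20, 5, 7)

with open("level.bin", "wb") as f:
    f.write(data)

print(os.path.getsize("level.bin"))  # 12500000 bytes, about 12 MB
```

Most of the pickle overhead comes from storing every (x, y, z) tuple key and Python object individually; a raw byte per voxel stores only the values.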

Word of advice: don't save stuff you don't need. A Perlin function delivers exactly the same results over and over again for the same inputs, so there is no need to store the entirety of its output.
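That determinism can be sketched like this; `noise_value` below is a hypothetical stand-in for a real Perlin function, since any deterministic function of (seed, x, y, z) makes the same point:

```python
import random

def noise_value(seed, x, y, z):
    # Stand-in for a real Perlin function: a deterministic RNG keyed on
    # (seed, x, y, z). Identical inputs always give an identical output.
    rng = random.Random(hash((seed, x, y, z)))
    return rng.random()

# Only the seed ever needs saving; the terrain is regenerated on demand.
world_seed = 42
first = noise_value(world_seed, 10, 20, 5)
again = noise_value(world_seed, 10, 20, 5)
print(first == again)  # True: regenerated, not stored
```

You would still persist player edits that differ from the generated terrain, but the untouched bulk of the world costs zero bytes on disk.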
If you are looking into storing stuff like volumetric data, I highly recommend switching to more efficient storage techniques like adaptive octrees or similar.
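A minimal sketch of the octree idea, written from scratch for illustration: uniform regions stay collapsed as a single leaf value instead of storing every voxel, so mostly-empty or mostly-solid terrain takes very little memory.

```python
class Octree:
    """Adaptive octree for voxel data; side length must be a power of two."""

    def __init__(self, size, fill=0):
        self.size = size          # side length of this cube
        self.value = fill         # leaf value while children is None
        self.children = None      # 8 sub-octants once subdivided

    def set(self, x, y, z, value):
        if self.size == 1:
            self.value = value
            return
        if self.children is None:
            if value == self.value:
                return            # region still uniform, store nothing
            half = self.size // 2
            self.children = [Octree(half, self.value) for _ in range(8)]
        half = self.size // 2
        idx = (x >= half) | ((y >= half) << 1) | ((z >= half) << 2)
        self.children[idx].set(x % half, y % half, z % half, value)

    def get(self, x, y, z):
        if self.children is None:
            return self.value     # whole region shares one value
        half = self.size // 2
        idx = (x >= half) | ((y >= half) << 1) | ((z >= half) << 2)
        return self.children[idx].get(x % half, y % half, z % half)

tree = Octree(64)
tree.set(3, 10, 40, 5)
print(tree.get(3, 10, 40), tree.get(0, 0, 0))  # 5 0
```

A production version would also merge children back into a leaf when they become uniform again, but the sketch shows the core trade: storage proportional to detail, not to volume.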

Break it into tiles like many terrain systems do, and attach a small SQLite DB to each area of terrain. Then all you need to search is the tiles near where you are looking for an object or affecting some object; the others are effectively "out of scope". That's what I'm going to do. You will still have a lot of data, but hopefully it's sparsely populated.
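One way that per-tile layout could be sketched with sqlite3; the chunk size, filename pattern, and schema here are all assumptions for the example:

```python
import sqlite3

CHUNK = 16  # assumed tile side length in voxels

def chunk_db(cx, cy):
    """Open (or create) the small database belonging to one terrain tile."""
    con = sqlite3.connect(f"chunk_{cx}_{cy}.db")
    con.execute("CREATE TABLE IF NOT EXISTS voxels ("
                "x INTEGER, y INTEGER, z INTEGER, value INTEGER,"
                "PRIMARY KEY (x, y, z))")
    return con

def set_voxel(x, y, z, value):
    con = chunk_db(x // CHUNK, y // CHUNK)
    con.execute("INSERT OR REPLACE INTO voxels VALUES (?, ?, ?, ?)",
                (x % CHUNK, y % CHUNK, z, value))
    con.commit()
    con.close()

def get_voxel(x, y, z):
    # Only the one tile containing the coordinate is ever opened;
    # every other tile stays "out of scope" on disk.
    con = chunk_db(x // CHUNK, y // CHUNK)
    row = con.execute("SELECT value FROM voxels WHERE x=? AND y=? AND z=?",
                      (x % CHUNK, y % CHUNK, z)).fetchone()
    con.close()
    return row[0] if row else None

set_voxel(100, 37, 8, 3)
print(get_voxel(100, 37, 8))  # 3
```

Keeping each tile in its own file also means levels the player never visits never touch the disk at all.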

Lord Gegoro Kitsune