Snappy, the Snaplet cat mascot, uses subsetting to get production-like data instead of the old Postgres pg_dump

Get your development data into shape! 🏋️‍♀️🏃‍♀️

Why Snaplet is a better data dump

Here at Snaplet, we’re not about to tell you that you need to get into shape. Our mascot Snappy is a cat, and cats are notoriously exercise-averse, so it’d be uncharacteristic of us to do so. What is our business, though, is the shape of the data you code against! We’re here to help you get your data ship-shape 🚢, so you can focus on shipping! 🚀 (See what we did there?)

We all know that coding against bad data is better than coding against no data, but coding against accurate, realistic data is way better than coding against bad data. Getting good data is easier said than done - how many times have you struggled to debug a complicated issue without having access to production data? Or battled with a dodgy seed script to get data into your development environment? Struggle no more! Snaplet makes it super easy to get your data into the right shape for you to code against.

There are a couple of ways to do this, depending on your particular situation. The first is the most obvious: snaplet snapshot capture. Rather than seeding your dev database with a seed script, you can use Snaplet to connect to your production database, then capture and restore a snapshot of that database to your development environment. Snaplet’s snapshot capture takes you from no data locally to sanitized production data in seconds. If you’re snapshotting production, the data you code against locally is always realistic and accurate. The shape of your local data matches the shape of your production data!

Sounds great, but are you worried about connecting Snaplet to your production database? We get it - that’s why you can self-host Snaplet to securely connect to production from within your own environment. Self-hosting means we never touch your production database, and you control the entire process, while still getting high quality, production-realistic snapshots to code against.

Coding against production data is great, but what if your production database is too big and unwieldy? We’ve seen production databases north of 200 GB, which is why we’ve introduced subsetting: capture a representative sample of your database, with all the relationships between your tables preserved. Subsetting lets you work with a smaller dataset, but it’s also useful for debugging: you could pull out a single customer record, with all of its linked data, to debug a very specific problem. It’s then super easy to share that tiny snapshot with other members of your team and work on the problem collaboratively.
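To make the idea concrete, here’s a toy TypeScript sketch of what subsetting amounts to conceptually: start from one target row and follow foreign-key references, so the captured subset stays referentially intact. The tables and rows below are invented for illustration; Snaplet derives the real relationship graph from your schema.

```typescript
// Conceptual sketch of subsetting: keep the target row plus every row
// that (transitively) references it, so foreign keys still resolve.
// The tables and rows below are hypothetical.
type Row = { table: string; id: number; refs: Array<{ table: string; id: number }> };

const db: Row[] = [
  { table: "customers", id: 7, refs: [] },
  { table: "orders", id: 100, refs: [{ table: "customers", id: 7 }] },
  { table: "order_items", id: 900, refs: [{ table: "orders", id: 100 }] },
  { table: "customers", id: 8, refs: [] }, // unrelated row: excluded from the subset
];

function subset(target: { table: string; id: number }): Row[] {
  const kept = new Set<string>([`${target.table}:${target.id}`]);
  // Keep sweeping until no new related rows are found.
  let grew = true;
  while (grew) {
    grew = false;
    for (const row of db) {
      const key = `${row.table}:${row.id}`;
      if (kept.has(key)) continue;
      if (row.refs.some((r) => kept.has(`${r.table}:${r.id}`))) {
        kept.add(key);
        grew = true;
      }
    }
  }
  return db.filter((row) => kept.has(`${row.table}:${row.id}`));
}

// Keeps customers:7, orders:100, and order_items:900; customers:8 is left out.
console.log(subset({ table: "customers", id: 7 }).map((r) => `${r.table}:${r.id}`));
```

A real implementation also has to follow references in the other direction (rows your target points at), but the fixed point idea is the same: grow the kept set until every relationship inside it is satisfied.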

On the other end of the spectrum, what if you don’t have any data? Starting a new project, and don’t want to manually enter a bunch of garbage data into your database, or write a seed script to populate it? Snaplet can now generate data directly into your database from the CLI with the command snaplet seed. This is also useful if you’re working with an already-populated database: if you’re building a feature that adds a new table, for instance, you can use snaplet seed to generate data for that table specifically. Here’s Snaplet co-founder Peter Pistorius demoing snaplet seed in action:
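Demo aside, you can picture generative seeding as walking a table’s columns and producing plausible, repeatable values for each row. The TypeScript sketch below is a toy illustration of that idea only; the members table and value pools are invented for the example, and snaplet seed reads your actual database schema instead.

```typescript
// Toy sketch of generative seeding: for each row, produce a value per
// column based on its type. The `members` table here is hypothetical.
type Column = { name: string; kind: "id" | "text" | "int" };

const membersTable: Column[] = [
  { name: "id", kind: "id" },
  { name: "nickname", kind: "text" },
  { name: "score", kind: "int" },
];

// Tiny deterministic PRNG (mulberry32), so the same seed always
// generates identical data.
function mulberry32(seed: number): () => number {
  return () => {
    seed |= 0;
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function seedRows(table: Column[], count: number, seed = 42) {
  const rand = mulberry32(seed);
  const words = ["snappy", "tabby", "calico", "manx"];
  const rows: Record<string, string | number>[] = [];
  for (let i = 0; i < count; i++) {
    const row: Record<string, string | number> = {};
    for (const col of table) {
      if (col.kind === "id") row[col.name] = i + 1; // sequential primary key
      else if (col.kind === "text") row[col.name] = words[Math.floor(rand() * words.length)];
      else row[col.name] = Math.floor(rand() * 100);
    }
    rows.push(row);
  }
  return rows;
}

console.log(seedRows(membersTable, 3)); // three plausible members rows
```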

Lastly, no matter how you get your data, whether it’s a snaplet snapshot restore from production or a snaplet seed on a brand new database, you can always transform your data into exactly the shape you need with JavaScript Transformations. When paired with Snaplet Copycat, JS Transformations let you replace values in your database with deterministic fake values. For example, you could transform the email addresses and locations in your users table using copycat.email(input) and copycat.country(input) respectively to generate safe, realistic, fake data. This is especially useful if you’re working with something similar to your production database, like a QA or staging database, but need to “massage” the data into looking more like actual production data.
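The key property of Copycat-style transformation is determinism: the same input always maps to the same fake output, so your transformed data stays stable across captures and related rows still line up. Here’s a minimal TypeScript sketch of that idea; it is not Copycat’s actual implementation, just an illustration built on a small FNV-1a hash.

```typescript
// Sketch of deterministic value replacement in the spirit of Snaplet
// Copycat (illustrative only, not Copycat's implementation): identical
// inputs always produce identical fake outputs.

// FNV-1a, 32-bit: a small, deterministic string hash.
function stableHash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

const NAMES = ["ada", "grace", "alan", "edsger", "barbara"];
const COUNTRIES = ["Portugal", "Japan", "Canada", "Kenya", "Norway"];

// Deterministic fake email: real addresses never leave the database,
// but rows that shared an email before still share one after.
function fakeEmail(input: string): string {
  const h = stableHash(input);
  return `${NAMES[h % NAMES.length]}.${h % 10000}@example.com`;
}

function fakeCountry(input: string): string {
  return COUNTRIES[stableHash(input) % COUNTRIES.length];
}

console.log(fakeEmail("real.person@corp.example"));
console.log(fakeCountry("real.person@corp.example"));
```

Because the mapping is a pure function of the input, running the same transformation twice, or on two different snapshots, yields the same fake values everywhere.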

The transformations above always live in your .snaplet/transform.ts file, and you can use snaplet proxy to “live reload” changes to them. Snaplet Proxy sits between your database and your database client, letting you see your transformations applied in real time. It’s great for quickly validating your transformations, which is handy when you’re transforming data as part of debugging an issue.
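Conceptually, a transforming proxy applies your transform to rows on their way from the database to the client, so you see transformed values live without rewriting the stored data. Here’s a tiny TypeScript sketch of that idea; the function and type names are illustrative, not Snaplet Proxy’s actual API.

```typescript
// Conceptual sketch of a transforming proxy: same query interface as
// the database, but every row passes through a transform step first.
// Names and data here are hypothetical.
type User = { id: number; email: string };

// Stand-in for a real query against the underlying database.
function queryUpstream(): User[] {
  return [
    { id: 1, email: "real.person@corp.example" },
    { id: 2, email: "another.human@corp.example" },
  ];
}

// The kind of per-row transform you would keep in .snaplet/transform.ts.
function transformRow(row: User): User {
  return { ...row, email: `user-${row.id}@example.com` };
}

// The proxy: callers query it exactly like the database, but only ever
// see transformed values. Edit transformRow and re-query to "live reload".
function queryThroughProxy(): User[] {
  return queryUpstream().map(transformRow);
}

console.log(queryThroughProxy());
```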

No matter your starting point - a production database, QA, staging, or even a completely blank database - Snaplet has the tools to get you coding against more realistic and accurate data in a snap! Don’t let "out of shape" data block you: use Snaplet and give your entire team access to production-like data so they can code, debug, and test with ease.

Jian Reis
January 24, 2022